
YOLO Implementation and Performance Analysis Pipeline

  • Writer: Raffay Hassan
  • Feb 16
  • 3 min read

Updated: Mar 30


Phase: 1 (Simulation Baseline)

Focus: YOLOv8n inference + lane-relevance filtering + performance reporting


Overview

This blog documents the YOLO implementation used in Phase 1, including the offline playback pipeline designed to evaluate detection behaviour and real-time feasibility. The aim was not only to run detections, but also to quantify performance (latency/FPS) and to reduce irrelevant detections using a lane-focused filtering approach.


Why YOLO in This Phase?

YOLO provides fast object detection and is widely used in autonomous perception. The objectives of this phase were to:

  • Run YOLOv8n across recorded scenario frames

  • Measure inference speed per frame

  • Track detection counts and class distribution

  • Reduce false relevance by isolating detections in the ego lane area

  • Generate evidence (CSV + graphs) suitable for dissertation reporting

Implementation Summary

The analysis tool loads:

  • Recorded RGB camera frames (camera/*.jpg)

  • Logged run outputs (sensor_fusion.csv, vehicle_state.csv)

  • LiDAR point clouds (lidar/*.ply)

It then performs:

  1. Lane-area estimation from the image

  2. YOLO inference on each frame

  3. In-lane filtering of detections

  4. Performance logging (ms/frame, FPS, detections/frame)

  5. Output generation (CSV + graphs)
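The five steps above can be sketched as a single playback loop. This is a minimal outline, not the tool's actual code: `detect` stands in for the YOLOv8n forward pass (e.g. `model(frame)` via ultralytics) and `in_lane` for the lane-mask test described in the next section — both names are illustrative assumptions.

```python
import time

def run_playback(frames, detect, in_lane):
    """Replay recorded frames through a detector, timing each inference.

    frames:  iterable of images (e.g. loaded from camera/*.jpg)
    detect:  callable, frame -> list of (class_name, bbox) detections
    in_lane: callable, bbox -> True if the box sits on the ego lane
    Returns one record per frame: (index, inference_ms, n_in_lane, classes).
    """
    records = []
    for i, frame in enumerate(frames):
        t0 = time.perf_counter()
        detections = detect(frame)                      # step 2: YOLO inference
        ms = (time.perf_counter() - t0) * 1000.0
        kept = [(cls, box) for cls, box in detections
                if in_lane(box)]                        # step 3: in-lane filter
        records.append((i, ms, len(kept),               # step 4: performance log
                        [cls for cls, _ in kept]))
    return records
```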


Figure 1: Detections in Ego lane only (YOLO)

Lane-Relevance Filtering (Reducing Noise)

A key improvement is lane filtering: the system detects the drivable region and prioritises only objects that are likely relevant to collision risk.

Lane Mask Creation

  • Convert image to HSV

  • Apply colour thresholds targeting road-like surfaces

  • Apply trapezoid ROI (perspective lane region)

  • Use morphology (open/close) to clean the mask
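A numpy-only sketch of the mask construction follows; the real pipeline uses OpenCV HSV thresholds plus morphological open/close, whereas here a single-channel intensity band stands in for the colour thresholds and the trapezoid ROI is rasterised row by row. All parameter values are illustrative assumptions.

```python
import numpy as np

def lane_mask(value, lo=60, hi=140, top_frac=0.55, top_w=0.2, bot_w=0.9):
    """Rough drivable-area mask from a single channel (e.g. HSV value).

    value: HxW uint8 array.
    A pixel passes if (a) its intensity falls in the road-like band
    [lo, hi] and (b) it lies inside a trapezoid ROI that starts narrow
    at top_frac of the image height and widens to bot_w of the image
    width at the bottom (perspective lane region).
    """
    h, w = value.shape
    road = (value >= lo) & (value <= hi)          # colour/intensity threshold
    roi = np.zeros((h, w), dtype=bool)
    y0 = int(h * top_frac)
    for y in range(y0, h):
        t = (y - y0) / max(h - 1 - y0, 1)         # 0 at ROI top, 1 at bottom
        half = 0.5 * w * (top_w + t * (bot_w - top_w))
        cx = w // 2
        roi[y, max(cx - int(half), 0):min(cx + int(half), w)] = True
    return road & roi                              # morphology would clean this
```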

In-Lane Object Test

For each bounding box:

  • take bottom-centre “ground contact” point

  • check pixel region around that point in the lane mask

  • accept detection if lane coverage exceeds a threshold
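That test can be sketched as below, assuming boxes in (x1, y1, x2, y2) pixel coordinates; the window radius and coverage threshold are illustrative values, not the tool's actual ones.

```python
import numpy as np

def box_in_lane(mask, box, radius=6, min_cover=0.3):
    """Accept a detection if the lane mask covers enough of a small
    window around the box's bottom-centre 'ground contact' point."""
    x1, y1, x2, y2 = box
    cx, cy = int((x1 + x2) / 2), int(y2)           # ground-contact pixel
    h, w = mask.shape
    window = mask[max(cy - radius, 0):min(cy + radius + 1, h),
                  max(cx - radius, 0):min(cx + radius + 1, w)]
    cover = window.mean() if window.size else 0.0  # fraction of lane pixels
    return cover >= min_cover
```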

This enables two classes of outputs:

  • In-lane detections (highlighted, high priority)

  • Out-of-lane detections (faded, lower priority)


Performance Metrics Logged

For every frame, the pipeline logs:

  • Frame index

  • YOLO inference time (ms)

  • Number of detections (in-lane only)

  • Detected classes list per frame
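The logging itself reduces to one CSV row per frame. A sketch assuming the record layout above — the actual tool's column names and file layout may differ:

```python
import csv

def write_metrics(records, fh):
    """Write per-frame YOLO metrics as CSV.

    records: iterable of (frame_idx, inference_ms, n_in_lane, class_list)
    fh:      any writable text file object
    """
    writer = csv.writer(fh)
    writer.writerow(["frame", "inference_ms", "in_lane_detections", "classes"])
    for idx, ms, n, classes in records:
        writer.writerow([idx, f"{ms:.2f}", n, ";".join(classes)])
```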


Graph Outputs

The tool produces performance visualisations, including:

  • Inference time over frames

  • FPS over time

  • Objects detected per frame

  • Top detected object classes

  • Inference time distribution histogram

  • Detections vs inference time scatter

  • Cumulative detections over time

  • Summary box with mean/median/min/max stats
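The graph generation itself is standard matplotlib. A sketch producing two of the plots listed above — the file name, figure layout, and bin count are illustrative choices:

```python
import matplotlib
matplotlib.use("Agg")                       # headless backend for saved figures
import matplotlib.pyplot as plt

def plot_performance(ms_per_frame, out_path="yolo_performance.png"):
    """Render latency over frames and its distribution, side by side."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(ms_per_frame)
    ax1.set(xlabel="frame", ylabel="inference time (ms)",
            title="Inference time over frames")
    ax2.hist(ms_per_frame, bins=20)
    ax2.set(xlabel="inference time (ms)", ylabel="frames",
            title="Inference time distribution")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
    return out_path
```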


Figure 2: YOLO Detection & Performance Metrics

YOLO Inference Time

Mean inference time: ~9.2 ms per frame, corresponding to ~114 FPS of processing throughput.

One initial spike (~900 ms) likely reflects:

  • Model warm-up

  • First-time GPU memory allocation

After warm-up, inference stabilises. This shows that YOLOv8n runs efficiently and consistently enough for real-time perception.
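One practical consequence of the warm-up spike: summary statistics should either report it separately or exclude it, since a single ~900 ms frame dominates the mean and max of a short run. A sketch, where the skip count of one warm-up frame is an assumption:

```python
def summarize_latency(ms_per_frame, skip_warmup=1):
    """Mean/min/max latency and implied FPS, excluding warm-up frames."""
    steady = ms_per_frame[skip_warmup:]
    mean_ms = sum(steady) / len(steady)
    return {
        "mean_ms": mean_ms,
        "min_ms": min(steady),
        "max_ms": max(steady),
        "fps": 1000.0 / mean_ms,   # throughput implied by steady-state mean
    }
```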

YOLO Processing Speed (FPS)

Mean FPS ≈ 114.5. This is significantly higher than:

  • The CARLA simulation tick rate

  • The real vehicle control frequency

Object detection is therefore not a bottleneck in the system.

Objects Detected Per Frame

Average ≈ 0.12 objects per frame. This figure is low by design:

  • Detections were filtered to objects in the ego lane only

  • The static obstacle was the primary detection target

This confirms that lane-based filtering effectively removes irrelevant detections.

Top 10 Detected Classes

Dominant class: Car

Other minor detections:

  • Person

  • Stop sign

  • Random background objects

This shows that YOLO correctly prioritises relevant traffic participants.

Radar–LiDAR Agreement vs YOLO

The comparison graph shows:

  • Sensor fusion distance over time

  • YOLO detection counts per frame

Observations:

  • YOLO detects the obstacle consistently while it is in the camera's view.

  • Fusion detects the obstacle regardless of visibility.

  • During braking, both systems align temporally.

This demonstrates that camera perception complements sensor fusion, while fusion ensures safety redundancy.

Outcome

This Phase 1 YOLO pipeline delivers:

  • Repeatable inference evaluation

  • Quantified performance evidence (ms/FPS)

  • Relevance filtering to reduce clutter

  • Report-ready CSV outputs and graphs

It acts as the perception evaluation foundation used to validate feasibility before expanding the project scope.


 
 
 
