YOLOv5n vs YOLOv8n: Generational Performance Evaluation in CARLA Simulation
- Raffay Hassan
- 6 days ago
Phase: 1 (Simulation Extended Model Evaluation)
Focus: Architecture comparison + adverse weather robustness + real-time stability
Overview
Following the initial YOLO implementation documented in the collision avoidance scenarios, this analysis compares two generational models: YOLOv5n (2020) and YOLOv8n (2023). The objective is to evaluate whether the newer YOLOv8 architecture provides meaningful improvements over the mature YOLOv5 for real-time collision detection. Both models were evaluated using identical lane-relevance filtering pipelines across the same CARLA scenarios: normal daylight conditions and extreme rain with reduced visibility.
Architectural Context
YOLOv5n (2020):
Backbone: CSPDarknet (Cross-Stage Partial connections)
Neck: PANet (Path Aggregation Network)
Head: Anchor-based detection with predefined boxes
Parameters: 1.9M
Maturity: 4 years of production deployment and optimization
YOLOv8n (2023):
Backbone: C2f modules (improved gradient flow)
Neck: Enhanced PANet with C2f
Head: Anchor-free detection with distribution focal loss
Parameters: 3.2M
Maturity: 1.5 years, newer architecture design
The key architectural advancement in YOLOv8 is the anchor-free design, which eliminates predefined anchor boxes and potentially improves generalization to unusual obstacle sizes and aspect ratios encountered in adverse weather.
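To make the anchor-based versus anchor-free distinction concrete, here is a simplified sketch of the two box-decoding styles as they are commonly described for these model families. It is not the Ultralytics source code: the function names, argument layout, and the four-bin example are assumptions chosen for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_anchor_based(tx, ty, tw, th, cell_x, cell_y, anchor_w, anchor_h, stride):
    """YOLOv5-style decode: the box size is a rescaling of a predefined anchor."""
    cx = (2.0 * sigmoid(tx) - 0.5 + cell_x) * stride
    cy = (2.0 * sigmoid(ty) - 0.5 + cell_y) * stride
    w = anchor_w * (2.0 * sigmoid(tw)) ** 2
    h = anchor_h * (2.0 * sigmoid(th)) ** 2
    return cx, cy, w, h


def expected_distance(logits):
    """DFL-style readout: softmax over discrete bins, then the expected bin index."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return sum(i * e / total for i, e in enumerate(exps))


def decode_anchor_free(side_logits, point_x, point_y, stride):
    """Anchor-free decode: four distances (left, top, right, bottom) from one point.

    side_logits holds four lists of bin logits; point_x/point_y are in pixels.
    """
    left, top, right, bottom = (expected_distance(s) * stride for s in side_logits)
    return point_x - left, point_y - top, point_x + right, point_y + bottom


# With all raw outputs at zero, the anchor-based box is exactly the anchor size.
print(decode_anchor_based(0, 0, 0, 0, 3, 4, 10, 20, 8))
# With uniform logits, each side distance is the mean bin index times the stride.
print(decode_anchor_free([[0, 0, 0, 0]] * 4, 100, 100, 8))
```

The anchor-based path can only stretch or shrink a preset shape, while the anchor-free path predicts each side independently, which is the property the paragraph above refers to.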
Test Scenarios
Scenario 1: Clear Conditions (Good Lighting)
Daylight environment, Town04
High visibility, clean camera frames
Dry road surface
Total frames: 2331
Scenario 2: Extreme Rain Conditions (Low-Light)
Heavy precipitation with fog
Wet roads, water droplets on camera
Low sun altitude, darker lighting
Reduced visibility
Total frames: 1185
Both scenarios use identical obstacle placement, vehicle speeds, and sensor configurations.
Lane-Filtering Implementation
Both models apply the same lane-relevance filtering:
Lane mask creation using HSV colour thresholds for road detection
Trapezoid ROI matching perspective lane geometry
Bottom-centre ground contact point test per detection
Only objects with contact points in ego lane are counted
This ensures evaluation focuses on collision-relevant detections.
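The contact-point step can be sketched roughly as follows. This is a minimal illustration rather than the project's pipeline: the trapezoid coordinates, the 640x480 frame size, and all function names are hypothetical, and the HSV lane-mask stage is left out.

```python
def bottom_centre(box):
    """Ground contact point of an (x1, y1, x2, y2) box: middle of its bottom edge."""
    x1, _y1, x2, y2 = box
    return ((x1 + x2) / 2.0, y2)


def point_in_convex_polygon(point, polygon):
    """True if the point is inside or on the edge of a convex polygon.

    The point must lie on the same side of every edge (same cross-product sign).
    """
    px, py = point
    sign = 0
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
        if cross != 0:
            current = 1 if cross > 0 else -1
            if sign == 0:
                sign = current
            elif current != sign:
                return False
    return True


# Hypothetical ego-lane trapezoid for a 640x480 frame: wide at the bottom,
# narrow towards the horizon.
EGO_LANE_ROI = [(200, 480), (440, 480), (360, 280), (280, 280)]


def in_ego_lane(box, roi=EGO_LANE_ROI):
    return point_in_convex_polygon(bottom_centre(box), roi)


detections = [
    {"cls": "car", "box": (290, 300, 350, 360)},  # directly ahead
    {"cls": "car", "box": (20, 300, 120, 380)},   # far left, outside the lane
]
relevant = [d for d in detections if in_ego_lane(d["box"])]
print(relevant)
```

Testing the bottom-centre point rather than the box centre keeps tall objects in neighbouring lanes from being counted just because their upper half overlaps the ROI.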
Results: Clear Conditions (Good Lighting)
Performance Comparison
Metric | YOLOv5n | YOLOv8n | Difference (YOLOv8n vs YOLOv5n) |
Car Detections | 129 | 133 | +3.1% |
Mean Inference Time | 20.5ms | 19.8ms | -3.4% (faster) |
Median Inference Time | 20.2ms | 19.9ms | -1.5% (faster) |
Mean FPS | 49.4 | 50.6 | +2.4% |
Min FPS | 1.5 | 10.9 | +627% (stability) |
Max FPS | 52.8 | 54.0 | +2.3% |
Max Inference Time | 673.90ms | 91.33ms | -86% (stability) |
Detection Rate | 5.5% | 5.7% | +0.2 pp |
Unique Classes | 1 (car only) | 1 (car only) | Equal |
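The percentage figures in the last column can be reproduced as the relative change of YOLOv8n against YOLOv5n. The helper name below is an assumption; the inputs are taken from the table.

```python
def pct_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate against baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# (YOLOv5n, YOLOv8n) pairs from the clear-conditions table.
clear = {
    "car_detections": (129, 133),
    "mean_inference_ms": (20.5, 19.8),
    "min_fps": (1.5, 10.9),
    "max_fps": (52.8, 54.0),
}

for name, (v5n, v8n) in clear.items():
    print(f"{name}: {pct_change(v5n, v8n):+.1f}%")
```

For time metrics a negative value means YOLOv8n is faster; for FPS and detection counts a positive value means YOLOv8n is higher.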

Graph Analysis: YOLOv5n Clear Conditions
Inference Time per Frame (Top Left):
Mean: 20.5ms, Median: 20.2ms
Mostly stable around 20ms baseline
One catastrophic spike visible reaching 673ms
This spike represents a 33× slowdown compared to typical performance
FPS Over Time (Top Right):
Mean: 49.4 FPS, relatively stable around 50 FPS
Min FPS: 1.5, which corresponds to the 673ms spike
Generally maintains real-time performance with occasional severe degradation
Objects Detected per Frame (Bottom Left):
Mean: 0.06 objects per frame
Clean detection pattern with only 3 major detection events
Lane filtering working correctly: only in-lane cars are counted
Top 10 Detected Classes (Bottom Right):
Car: 129 detections - the collision target
Unique Classes: 1 (perfect lane filtering in good conditions)
No false positives from background objects
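The FPS figures above follow from per-frame latency. A short sketch, assuming FPS is computed per frame as 1000 divided by the inference time in milliseconds (the function names and the three-frame example are illustrative):

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Instantaneous FPS for one frame."""
    return 1000.0 / latency_ms


def mean_fps(latencies_ms):
    """Average of per-frame FPS values (not 1000 / mean latency)."""
    return sum(fps_from_latency_ms(t) for t in latencies_ms) / len(latencies_ms)


# The 673.90ms spike maps to the reported minimum of about 1.5 FPS.
print(round(fps_from_latency_ms(673.90), 1))

# Averaging per-frame FPS gives a higher number than inverting the mean latency.
sample = [20.0, 20.0, 40.0]
print(mean_fps(sample), 1000.0 / (sum(sample) / len(sample)))
```

Under that assumption, the averaging order would also account for the reported mean FPS (49.4) sitting slightly above 1000 / 20.5ms.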

Statistical Analysis: YOLOv5n Clear Conditions
Inference Time Distribution (Top Left):
Highly concentrated distribution around 20ms
Small secondary peak around 23ms
Clean histogram showing consistent performance when stable
Detections vs Processing Time (Top Right):
Scatter plot shows no correlation between detection count and inference time
Trend line nearly flat: y=0.15x+20.52
The catastrophic spike occurs independently of detection complexity
Cumulative Detections (Bottom Left):
Total: 129 cars detected
Steady accumulation through scenario
Plateau periods where no cars are in lane (expected behavior)
Performance Summary (Bottom Right):
Mean inference: 20.52ms
Std Dev: 13.56ms (high variability due to spike)
Max: 673.90ms, which represents the critical failure mode
Detection rate: 5.5%
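The trend lines reported in the scatter panel are ordinary least-squares fits of inference time against detections per frame. A minimal sketch of such a fit, using made-up sample points rather than the logged data:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative points only: detections per frame vs inference time in ms.
detections_per_frame = [0, 0, 0, 1, 1, 2]
inference_ms = [20.4, 20.6, 20.5, 20.7, 20.6, 20.9]

slope, intercept = fit_line(detections_per_frame, inference_ms)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

A slope of a fraction of a millisecond per detected object, as in the reported fits, means detection count has little practical effect on latency.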

Graph Analysis: YOLOv8n Clear Conditions
Inference Time per Frame (Top Left):
Mean: 19.8ms, Median: 19.9ms
Extremely stable performance
No catastrophic spikes visible
Consistent 20ms baseline throughout entire run
FPS Over Time (Top Right):
Mean: 50.6 FPS, very stable
Min FPS: 10.9 (significantly better than YOLOv5n's 1.5)
Max FPS: 54.0
Tight FPS range indicates predictable, reliable performance
Objects Detected per Frame (Bottom Left):
Mean: 0.06 objects per frame
Similar detection pattern to YOLOv5n
3 major detection events aligned with obstacle encounters
Top 10 Detected Classes (Bottom Right):
Car: 133 detections (+4 more than YOLOv5n)
Unique Classes: 1 (perfect lane filtering)
Slightly better detection count in identical scenario

Statistical Analysis: YOLOv8n Clear Conditions
Inference Time Distribution (Top Left):
Very tight distribution centered on 20ms
Clean, concentrated histogram
No secondary peaks or outliers
Detections vs Processing Time (Top Right):
Similar flat trend line: y=0.44x+19.92
Slightly steeper than YOLOv5n but still minimal correlation
No outliers in scatter plot
Cumulative Detections (Bottom Left):
Total: 133 cars detected
Nearly identical accumulation pattern to YOLOv5n
Same plateau regions during no-detection periods
Performance Summary (Bottom Right):
Mean inference: 19.95ms (slightly faster than v5n)
Std Dev: 1.18ms (much lower than v5n's 13.56ms)
Max: 67.88ms (vs v5n's 673.90ms) - 10× more stable
Detection rate: 5.7%
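The gap between the two standard deviations can plausibly be explained by a single outlier. The sketch below uses a synthetic trace (a flat 20ms for 2330 frames plus one 673.9ms frame), which is an assumption and not the logged data, to show how much one spike moves the statistics.

```python
import statistics

# Synthetic trace: flat baseline plus one spike.
baseline = [20.0] * 2330
with_spike = baseline + [673.9]

print("mean   :", round(statistics.fmean(with_spike), 2))
print("median :", statistics.median(with_spike))
print("std dev:", round(statistics.pstdev(with_spike), 2))
print("std dev without spike:", statistics.pstdev(baseline))
```

On this synthetic trace the median is unaffected while the standard deviation rises to roughly 13.5ms, close to the 13.56ms reported for YOLOv5n. That suggests the reported variability comes from the one spike rather than from general jitter.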
Key Findings: Clear Conditions
YOLOv8n demonstrates:
Marginally faster inference (19.8ms vs 20.5ms)
4 additional car detections (133 vs 129)
Dramatically superior stability - no catastrophic slowdowns
10× better worst-case performance (67.88ms vs 673.90ms)
The critical difference is reliability. YOLOv5n's 673ms spike would create a 6.7-meter blind spot at 10 m/s vehicle speed, which is unacceptable for collision avoidance. YOLOv8n's worst case of 67.88ms creates only a 0.68-meter blind spot under identical conditions.
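The blind-distance figures are simply speed multiplied by latency. A small sketch of that arithmetic, using the 10 m/s speed and worst-case latencies quoted above (the function name is illustrative):

```python
def blind_distance_m(speed_mps: float, latency_ms: float) -> float:
    """Distance travelled while one frame is still being processed."""
    return speed_mps * latency_ms / 1000.0


SPEED_MPS = 10.0  # vehicle speed used in the text

for model, worst_ms in [("YOLOv5n", 673.90), ("YOLOv8n", 67.88)]:
    print(f"{model}: {blind_distance_m(SPEED_MPS, worst_ms):.2f} m")
```

The relationship is linear, so doubling the vehicle speed doubles the blind distance for the same latency.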
Results: Extreme Rain Conditions (Low-Light)
Performance Comparison
Metric | YOLOv5n | YOLOv8n | Difference |
Car Detections | 28 | 21 | -25.0% |
Total Objects | 61 | 104 | +70.5% |
Mean Inference Time | 19.9ms | 19.7ms | -1.0% (faster) |
Median Inference Time | 19.8ms | 19.5ms | -1.5% (faster) |
Mean FPS | 50.3 | 51.0 | +1.4% |
Min FPS | 14.5 | 17.7 | +22% |
Max FPS | 53.2 | 54.5 | +2.4% |
Max Inference Time | 68.93ms | 56.49ms | -18% (stability) |
Detection Rate | 4.6% | 8.1% | +76% (all classes) |
Unique Classes | 5 | 7 | +2 |

Graph Analysis: YOLOv5n Rain Conditions
Inference Time per Frame (Top Left):
Mean: 19.9ms, Median: 19.8ms
Stable baseline around 20ms
Peak spike reaches 69ms
Much better stability than clear conditions (no 600ms+ spikes)
FPS Over Time (Top Right):
Mean: 50.3 FPS
Min: 14.5 FPS (corresponding to 69ms spike)
Generally stable performance in rain
Tight FPS band around 48-52 range
Objects Detected per Frame (Bottom Left):
Mean: 0.05 objects per frame
Sparse detection pattern
One spike reaching 2 objects simultaneously
Lower detection activity than clear conditions
Top 10 Detected Classes (Bottom Right):
Car: 28 detections - the primary target
Airplane: 19 detections
Train: 11 detections
Boat: 2, Skateboard: 1
Unique Classes: 5 (lane filtering less effective in rain)

Statistical Analysis: YOLOv5n Rain Conditions
Inference Time Distribution (Top Left):
Tight concentration around 20ms
Clean distribution with minimal spread
No significant outliers in histogram
Detections vs Processing Time (Top Right):
Flat trend line: y=0.36x+19.92
No correlation between object count and inference time
Sparse scatter pattern due to low detection frequency
Cumulative Detections (Bottom Left):
Total: 61 objects (all classes)
28 cars specifically (from class distribution)
Slower accumulation than clear conditions
Stepped pattern shows clustered detection events
Performance Summary (Bottom Right):
Mean inference: 19.94ms
Std Dev: 1.55ms (low variability)
Max: 68.93ms (acceptable for real-time)
Detection rate: 4.6%

Graph Analysis: YOLOv8n Rain Conditions
Inference Time per Frame (Top Left):
Mean: 19.7ms, Median: 19.5ms
Stable 20ms baseline
Several spikes visible reaching 22-23ms
One peak around 56ms
FPS Over Time (Top Right):
Mean: 51.0 FPS
Min: 17.7 FPS (better than YOLOv5n's 14.5)
Consistent performance around 50-52 FPS
Slightly tighter stability band than YOLOv5n
Objects Detected per Frame (Bottom Left):
Mean: 0.09 objects per frame (higher than YOLOv5n)
Multiple detection spikes reaching 2 objects
More frequent detection activity
Top 10 Detected Classes (Bottom Right):
Car: 21 detections (-25% vs YOLOv5n)
Airplane: 49 detections
Baseball bat: 24 detections
Skateboard: 4, Truck: 3, Frisbee: 2, Bus: 1
Unique Classes: 7 (more false positives in rain)

Statistical Analysis: YOLOv8n Rain Conditions
Inference Time Distribution (Top Left):
Sharp peak around 19-20ms
Very concentrated distribution
Minimal variance from mean
Detections vs Processing Time (Top Right):
Minimal trend: y=0.29x+19.64
Flat scatter indicating no complexity-speed relationship
Consistent inference regardless of detection count
Cumulative Detections (Bottom Left):
Total: 104 objects (all classes)
21 cars specifically (from class distribution)
Faster accumulation rate than YOLOv5n for total objects
But fewer cars specifically
Performance Summary (Bottom Right):
Mean inference: 19.66ms (marginally faster than v5n)
Std Dev: 1.23ms (lower than v5n's 1.55ms)
Max: 56.49ms (better than v5n's 68.93ms)
Detection rate: 8.1% (all classes)
Key Findings: Rain Conditions
YOLOv5n demonstrates:
33% more car detections (28 vs 21) - significant advantage
Fewer total objects detected (61 vs 104)
More stable than in its own clear-conditions run (68.93ms worst case, no 600ms+ spikes)
Better focus on collision-relevant targets in adverse weather
YOLOv8n demonstrates:
More total detections but fewer cars specifically
Slightly faster inference (19.7ms vs 19.9ms)
Better worst-case latency (56.49ms vs 68.93ms)
More false positives (airplanes, baseball bats) in rain
The rain results reveal an interesting trade-off: YOLOv5n's anchor-based design appears better calibrated for detecting cars in degraded visibility, while YOLOv8n detects more objects overall but with lower precision on the collision target.
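One rough way to quantify this trade-off is the share of lane-filtered detections that are cars. This is only a proxy: without ground-truth labels it is not true precision, and the variable names are illustrative. The class counts are the ones listed in the rain graph analysis.

```python
rain_counts = {
    "YOLOv5n": {"car": 28, "airplane": 19, "train": 11, "boat": 2, "skateboard": 1},
    "YOLOv8n": {"car": 21, "airplane": 49, "baseball bat": 24, "skateboard": 4,
                "truck": 3, "frisbee": 2, "bus": 1},
}


def car_share(counts):
    """Fraction of all lane-filtered detections that were labelled 'car'."""
    total = sum(counts.values())
    return counts["car"] / total


for model, counts in rain_counts.items():
    print(f"{model}: {sum(counts.values())} objects, car share {car_share(counts):.1%}")
```

By this proxy, a little under half of YOLOv5n's rain detections are cars, compared with about a fifth for YOLOv8n.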
Overall Comparison Summary
Combined Detection Performance
Condition | YOLOv5n Cars | YOLOv8n Cars | Difference |
Clear (2331 frames) | 129 | 133 | YOLOv8n +3.1% |
Rain (1185 frames) | 28 | 21 | YOLOv5n +33.3% |
Combined Total | 157 | 154 | YOLOv5n +1.9% |
YOLOv5n edges out YOLOv8n in total car detections (+3 cars across 3516 frames), driven entirely by superior rain performance.
Speed and Stability
Metric | YOLOv5n | YOLOv8n | Winner |
Mean Inference (Clear) | 20.5ms | 19.8ms | YOLOv8n |
Mean Inference (Rain) | 19.9ms | 19.7ms | YOLOv8n |
Worst Case (Clear) | 673.90ms | 67.88ms | YOLOv8n (10× better) |
Worst Case (Rain) | 68.93ms | 56.49ms | YOLOv8n |
Min FPS (Clear) | 1.5 | 10.9 | YOLOv8n |
Min FPS (Rain) | 14.5 | 17.7 | YOLOv8n |
YOLOv8n is consistently faster and dramatically more stable, especially in clear conditions.
The Critical Trade-Off
This comparison reveals a fundamental trade-off between two important qualities:
YOLOv5n strengths:
33% more car detections in rain (28 vs 21)
Slightly higher total car count overall (+3 across both scenarios)
Proven maturity (4 years production deployment)
YOLOv5n weaknesses:
673ms catastrophic spike in clear conditions (1.5 FPS)
Unpredictable stability - cannot guarantee sub-100ms response
This single failure mode disqualifies it for safety-critical systems
YOLOv8n strengths:
Consistent sub-70ms worst-case latency (67.88ms clear, 56.49ms rain)
10× more stable than YOLOv5n in clear conditions
Faster average inference across both scenarios
More total object detections (but lower precision on cars in rain)
YOLOv8n weaknesses:
25% fewer car detections in rain vs YOLOv5n
Slightly lower total car count overall (-3 across both scenarios)
More false positives in adverse weather
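The stability criterion can be written as a simple worst-case gate. The 100ms budget below is an assumption taken from the "sub-100ms" wording above, not a measured requirement; the worst-case values come from the Speed and Stability table.

```python
# Worst-case inference time per scenario, in milliseconds.
WORST_CASE_MS = {
    "YOLOv5n": {"clear": 673.90, "rain": 68.93},
    "YOLOv8n": {"clear": 67.88, "rain": 56.49},
}

LATENCY_BUDGET_MS = 100.0  # assumed budget, for illustration only


def meets_budget(per_scenario_ms, budget_ms=LATENCY_BUDGET_MS):
    """A model passes only if its worst frame in every scenario fits the budget."""
    return max(per_scenario_ms.values()) <= budget_ms


eligible = [model for model, worst in WORST_CASE_MS.items() if meets_budget(worst)]
print(eligible)
```

Gating on the maximum rather than the mean is what lets a single 673.90ms frame outweigh an otherwise comparable average.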
Decision Justification
For safety-critical real-time collision avoidance, YOLOv8n is selected despite YOLOv5n's superior rain detection performance.
Reasoning:
At 10 m/s vehicle speed, latency directly translates to blind distance:
YOLOv5n worst case: 673ms = 6.7-meter blind spot
YOLOv8n worst case: 67.88ms = 0.68-meter blind spot
A 6.7-meter blind spot is catastrophic for collision avoidance. Even if YOLOv5n detects 33% more cars in rain when functioning normally, the system cannot tolerate unpredictable 673ms stalls.
The 25% reduction in rain car detections (21 vs 28 cars) is acceptable because:
Multi-sensor fusion provides redundancy (LiDAR + radar compensate)
Consistent 20ms latency enables predictable control response
Total detection count remains sufficient for obstacle awareness
Implications for Sensor Fusion
Both models show significant detection degradation in adverse weather:
YOLOv5n: 129 cars (clear) → 28 cars (rain) = 78% drop
YOLOv8n: 133 cars (clear) → 21 cars (rain) = 84% drop
This validates the core project hypothesis: camera-based perception alone is insufficient. LiDAR and radar provide critical redundancy when vision degrades.
Even the better-performing YOLOv5n loses nearly 80% of its detection capability in rain, demonstrating why multi-sensor fusion is essential for reliable autonomous collision prevention.
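One caveat when reading these drops: the rain run is about half as long as the clear run (1185 vs 2331 frames), so raw counts and per-frame rates tell slightly different stories. The sketch below computes both, assuming car detections scale with the number of frames processed; the function names are illustrative.

```python
FRAMES = {"clear": 2331, "rain": 1185}
CARS = {
    "YOLOv5n": {"clear": 129, "rain": 28},
    "YOLOv8n": {"clear": 133, "rain": 21},
}


def raw_drop_pct(cars):
    """Drop in raw car-detection count from clear to rain."""
    return (1.0 - cars["rain"] / cars["clear"]) * 100.0


def per_frame_drop_pct(cars, frames=FRAMES):
    """Drop in car detections per frame from clear to rain."""
    clear_rate = cars["clear"] / frames["clear"]
    rain_rate = cars["rain"] / frames["rain"]
    return (1.0 - rain_rate / clear_rate) * 100.0


for model, cars in CARS.items():
    print(f"{model}: raw {raw_drop_pct(cars):.0f}%, per frame {per_frame_drop_pct(cars):.0f}%")
```

Normalised per frame the drop is smaller than the raw figures but still well over half for both models, so the case for sensor fusion holds either way.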
Outcome
This generational comparison demonstrates:
YOLOv8n provides:
3.4% faster average inference in good conditions
10× better worst-case stability (critical for safety)
Acceptable detection performance despite 25% rain deficit
More predictable, reliable real-time behavior
YOLOv5n provides:
33% better car detection in adverse weather
Marginal total detection advantage (+3 cars)
Proven production maturity
Catastrophic failure mode that disqualifies it for safety-critical use
Selection: YOLOv8n chosen for Phase 2 hardware deployment due to superior stability and consistent sub-70ms latency, despite YOLOv5n's rain detection advantage. Multi-sensor fusion compensates for camera vision limitations in adverse weather.
Key lesson: For safety-critical systems, worst-case behavior matters more than average performance. YOLOv5n's superior rain detection cannot compensate for unpredictable 673ms stalls that would create multi-meter blind spots during obstacle approach.
Complete performance data, graphs, and CARLA scenario videos are available in the project repository for reproducibility.
Github:


