
YOLOv8 on Jetson CUDA: Trying to Fix the LiDAR Noise Problem With a Camera

  • Writer: Raffay Hassan
  • Mar 3
  • 5 min read

Updated: Mar 30

The LiDAR had a problem I couldn't ignore: it couldn't tell a real obstacle from a reflection. A shiny floor, a piece of glass, even the wrong angle of light: all of them would produce a confident STOP alert from nothing. The persistence filter helped, but it didn't solve the root issue.


The solution was to add a camera and make the LiDAR and camera cross-validate each other. An obstacle only triggers an alert if both sensors agree something is there. The LiDAR provides range, the camera confirms the object actually exists. If LiDAR fires but the camera sees nothing, the alert gets suppressed as noise.

Everything in this post is still bench testing: camera, LiDAR, and radar all sitting on a desk. No hardware is on the RC car yet. Getting the full three-sensor pipeline running cleanly in software was the goal before any physical integration happens.


Why Cross-Validation Works


Each sensor has a specific failure mode on its own:

  • LiDAR on its own: excellent geometry, terrible at ignoring reflections and noise. One stray point in the wrong zone and you get a false STOP.

  • Camera on its own: can identify what's there, but has no idea how far away it is. A person at 0.5m and a person at 5m look the same to YOLO.


Together they cover for each other. The table of decisions:

LiDAR          Camera                     Result
Sees object    Confirms object in zone    CONFIRMED: trigger alert
Sees object    Sees nothing               REJECTED: it's noise
Sees nothing   Sees object                CAUTION only: no range data
Sees nothing   Sees nothing               SAFE
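That decision table is small enough to sketch directly. This is an illustrative function, not the project's actual API; the name and return strings are mine:

```python
def fuse(lidar_sees_object: bool, camera_confirms: bool) -> str:
    """Cross-validation: LiDAR provides range, the camera confirms existence."""
    if lidar_sees_object and camera_confirms:
        return "CONFIRMED"   # both agree: trigger the alert
    if lidar_sees_object:
        return "REJECTED"    # LiDAR-only hit: treat as reflection/noise
    if camera_confirms:
        return "CAUTION"     # camera-only: object exists, but no range data
    return "SAFE"
```

The asymmetry is deliberate: a LiDAR-only hit is suppressed outright, while a camera-only hit still raises a caution, because the camera's failure mode (missing range) is less dangerous than the LiDAR's (phantom obstacles).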


The Camera: Arducam IMX477


Image 1: Camera connected to Jetson

The camera is an Arducam IMX477 HQ (model B0249), the same Sony 12.3MP sensor that's in the Raspberry Pi High Quality Camera, adapted for Jetson with an automatic IR cut filter. It connects to the Jetson Orin Nano via the CSI port.


Getting the Driver Working on JetPack 6

The IMX477 isn't supported out of the box on JetPack 6. Arducam has an install script that handles it; the critical detail is the -m imx477 flag. Without it you get the generic Jetvariety driver, which doesn't work for this sensor.


The GStreamer Pipeline

This is where I spent most of my debugging time. The IMX477 outputs raw 10-bit Bayer (RG10 format), which you can't just open directly with OpenCV. It needs to go through NVIDIA's ISP pipeline first via nvarguscamerasrc. I also took the opportunity to downscale to 416x416 in hardware, which is the exact size YOLOv8n wants, so there's zero CPU resize cost.
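A pipeline along these lines matches that description. This is a sketch, not the project's exact string: element parameters like the frame rate depend on the sensor mode, and the `cv2.VideoCapture` call is left commented because it needs the GStreamer-enabled OpenCV build:

```python
# nvarguscamerasrc pushes the raw 10-bit Bayer output through NVIDIA's ISP;
# nvvidconv then downscales to 416x416 in hardware, so OpenCV never touches
# the full-resolution frames.
PIPELINE = (
    "nvarguscamerasrc ! "
    "video/x-raw(memory:NVMM), width=1920, height=1080, framerate=30/1 ! "
    "nvvidconv ! video/x-raw, width=416, height=416, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! "
    "appsink drop=1 max-buffers=1"  # leaky sink: latest frame or nothing
)

# import cv2
# cap = cv2.VideoCapture(PIPELINE, cv2.CAP_GSTREAMER)
```

The appsink settings at the end are what give the "latest frame or nothing" behaviour described later in this post.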


The OpenCV Version Trap

Even with the pipeline right, the pip-installed opencv-python doesn't support NVMM buffers on Jetson. The system apt package does, but it lives in /usr/lib/python3.10 outside the venv. The fix was to find the right binary and symlink it in:


find /usr -name "cv2*.so" 2>/dev/null
# /usr/lib/python3.10/dist-packages/cv2/python-3.10/cv2.cpython-310-aarch64-linux-gnu.so

ln -s /usr/lib/python3.10/dist-packages/cv2/python-3.10/cv2.cpython-310-aarch64-linux-gnu.so \
    /home/digit/robot_env/lib/python3.10/site-packages/cv2.so

python3 -c "import cv2; print(cv2.__version__)"
# 4.8.0


4.8.0, with GStreamer: YES in the build info. That's what I want.


Getting CUDA on JetPack 6.2

The standard pip torch has no CUDA support on Jetson; it's CPU-only. NVIDIA builds a custom wheel for each JetPack version, and it depends on cuSPARSELt, which isn't included in the base JetPack install. You have to grab that separately first.


sudo dpkg -i cusparselt-local-tegra-repo-ubuntu2204-0.7.1_1.0-1_arm64.deb
sudo cp /var/cusparselt-local-tegra-repo-ubuntu2204-0.7.1/cusparselt-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update && sudo apt-get install -y libcusparselt0 libcusparselt-dev


Then the actual PyTorch wheel for JetPack 6.1/6.2:


pip install torch-2.5.0a0+872d972e41.nv24.08.17622132-cp310-cp310-linux_aarch64.whl


The torchvision Nightmare

Getting YOLO to actually load was its own adventure. Ultralytics needs torchvision for Non-Maximum Suppression. But every torchvision binary from PyPI is compiled against a different PyTorch version and crashes immediately with "operator torchvision::nms does not exist".

After trying several versions and watching them all fail in different ways, I ended up mocking the entire torchvision module in Python before ultralytics gets a chance to import it. The mock provides a pure-PyTorch NMS implementation and stubs out all the other submodules ultralytics touches.


import sys, types, torch, enum

def _nms(boxes, scores, iou_threshold):
    if boxes.numel() == 0:
        return torch.zeros(0, dtype=torch.long)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0].item(); keep.append(i)
        if order.numel() == 1:
            break
        rest = order[1:]
        inter = (
            (x2[rest].clamp(max=float(x2[i])) - x1[rest].clamp(min=float(x1[i]))).clamp(0) *
            (y2[rest].clamp(max=float(y2[i])) - y1[rest].clamp(min=float(y1[i]))).clamp(0)
        )
        order = rest[inter / (areas[i] + areas[rest] - inter + 1e-6) <= iou_threshold]
    return torch.tensor(keep, dtype=torch.long)

_tv = types.ModuleType('torchvision')
_tv.__version__ = '0.20.0'
# ops, transforms, models, datasets, utils, io all mocked similarly
sys.modules['torchvision'] = _tv

# now it's safe to import ultralytics
from ultralytics import YOLO


This runs at startup, before ultralytics loads. Ultralytics then finds a complete-looking torchvision in sys.modules and never tries to import the broken binary. Ugly, but effective.


YOLO Running on the GPU

With all that sorted, YOLO loads on CUDA and inference is straightforward:


import torch
from ultralytics import YOLO

device = "cuda" if torch.cuda.is_available() else "cpu"
model = YOLO("yolov8n.pt")
model.to(device)
# [YOLO] Model loaded: yolov8n.pt on CUDA


The imgsz=416 matches the camera pipeline output exactly, so no internal resize happens inside YOLO either.

For the forward zone filter, only objects whose centre x falls within the middle 40% of the frame count as "directly ahead":


CENTRE_ZONE_FRAC = 0.40
cx_min = frame_width * (0.5 - CENTRE_ZONE_FRAC / 2)
cx_max = frame_width * (0.5 + CENTRE_ZONE_FRAC / 2)


Objects outside this zone are still drawn on the camera feed but don't contribute to collision decisions.
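Wrapped into a helper, the whole check is a few lines. This is a hypothetical function of my own, assuming boxes in pixel xyxy format, not code from the project:

```python
CENTRE_ZONE_FRAC = 0.40  # middle 40% of the frame counts as "directly ahead"

def in_forward_zone(x1: float, x2: float, frame_width: float) -> bool:
    """True when a box's horizontal centre falls inside the forward zone."""
    cx = (x1 + x2) / 2
    cx_min = frame_width * (0.5 - CENTRE_ZONE_FRAC / 2)
    cx_max = frame_width * (0.5 + CENTRE_ZONE_FRAC / 2)
    return cx_min <= cx <= cx_max
```

At 416 pixels wide, the zone spans roughly pixels 125 to 291, so a detection hugging either edge of the frame never feeds the collision logic.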


Keeping the Camera Feed Smooth

A few things made a noticeable difference for performance:

  • Hardware resize in the pipeline: nvvidconv does the 1920x1080 to 416x416 downscale on the GPU, so OpenCV never handles the full resolution. That was the biggest single speedup.

  • YOLO runs every other frame: the camera display updates every frame, but inference only happens every 2nd frame. The display feels smooth, and the GPU load halves.

  • The leaky queue in GStreamer: when the system is under load, old frames get dropped rather than queued. I got the latest frame or nothing, never a backlog of stale frames.
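The every-other-frame scheme amounts to a few lines. A minimal sketch; `process_frame` and `INFER_EVERY` are illustrative names, not the project's:

```python
INFER_EVERY = 2      # run YOLO on every 2nd frame only
last_results = None  # detections carried over on the skipped frames

def process_frame(frame_idx, frame, run_inference):
    """Display every frame, but refresh detections only on inference frames."""
    global last_results
    if frame_idx % INFER_EVERY == 0:
        last_results = run_inference(frame)  # e.g. the YOLO model call
    return last_results  # stale-but-recent detections in between
```

At camera frame rates a detection that is one frame old is indistinguishable on screen, which is why the display stays smooth while the GPU does half the work.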


Where Things Stand


Image 2: Working system with live yolov8n running.

With everything running on the bench:

  • LiDAR: 38 points per frame, ±20° forward cone, 3-frame persistence filter

  • Radar: 41 tracks streaming from Pi, TTC per track

  • Camera: YOLOv8n on CUDA at 416x416, every 2nd frame

  • Fusion: LiDAR confirmed only when camera agrees

  • GUI: 9.5 FPS full update

The complete software stack is validated and working. The next step is physical integration: mounting everything onto the RC car and doing real corridor tests. That's a future post.


A Few Things I'd Tell Myself at the Start

Sensor fusion is genuinely harder than it sounds, but also more powerful than any single sensor alone. Each sensor in isolation had real failure modes that made it unreliable on its own. Together, with cross-validation, most of those failure modes cancel out.

Embedded deployment on JetPack is always more work than expected. The custom PyTorch wheel, the GStreamer pipeline format, the OpenCV symlink, the torchvision mock: none of this is documented in one place. Hopefully this series shortens that journey for someone else.

