TECHNOLOGY · PERCEPTION & HD MAP

Camera-first perception, map-grounded reasoning.

A YOLOv8n detector that fits in under 12 MB, a Lanelet2 HD-map localizer that turns GPS and wheel odometry into lane-accurate pose, and a map-aware motion predictor that actually understands the road geometry your ego car is sitting on.

Three layers turn pixels into prediction

Each is independently swappable. Bring your own perception, keep our HD map. Bring your own map, keep our prediction. Composable by design.

LAYER 1 · DETECTION

YOLOv8n (ONNX)

ONNX Runtime, CPU execution, sub-12 MB quantized weights. Classes: vehicle, pedestrian, cyclist, traffic sign. Bring-your-own detector supported: we consume tracks, not pixels, at the field boundary.

  • input: 640 × 480 @ 20 FPS
  • conf_threshold: 0.4
  • target: RPi Camera Module 3
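
As a rough illustration of the detector boundary, here is a minimal Rust sketch of confidence gating before detections are handed on to tracking. `ObjectClass`, `Detection`, and `filter_detections` are hypothetical names for this example, not the shipped API; only the 0.4 threshold, the 640 × 480 frame, and the four classes come from the spec above. Treating the threshold as inclusive is also an assumption.

```rust
/// Hypothetical class set mirroring the four classes listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ObjectClass {
    Vehicle,
    Pedestrian,
    Cyclist,
    TrafficSign,
}

/// Hypothetical detection record. The box is (x, y, width, height) in pixels,
/// with (x, y) the top-left corner in the input frame.
#[derive(Debug, Clone, PartialEq)]
pub struct Detection {
    pub class: ObjectClass,
    pub confidence: f32,
    pub bbox_xywh: [f32; 4],
}

/// Keep detections whose confidence is at least `conf_threshold` and whose
/// box is non-empty and lies fully inside a `width` x `height` frame.
pub fn filter_detections(
    raw: &[Detection],
    conf_threshold: f32,
    width: f32,
    height: f32,
) -> Vec<Detection> {
    raw.iter()
        .filter(|d| d.confidence >= conf_threshold)
        .filter(|d| {
            let [x, y, w, h] = d.bbox_xywh;
            w > 0.0 && h > 0.0 && x >= 0.0 && y >= 0.0 && x + w <= width && y + h <= height
        })
        .cloned()
        .collect()
}
```

Called as `filter_detections(&raw, 0.4, 640.0, 480.0)`, this drops low-confidence boxes and boxes that spill past the image edge before anything reaches the tracker.
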

LAYER 2 · LOCALIZATION

Lanelet2 HD map

Native Lanelet2 OSM XML importer. Lanelets, crosswalks, stop lines, ODD polygons. Route planner with A* across the lanelet graph. Localization projects GPS + odometry onto the nearest lanelet.

  • import_osm_xml()
  • plan_route(start, goal)
  • Frenet (s, d) frame available
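
To make the Frenet frame concrete, the sketch below projects a point onto a lanelet centerline given as a polyline and returns the arc length `s` and signed lateral offset `d`. It is a plain-geometry illustration, not the Lanelet2 API: `Frenet`, `to_frenet`, the polyline representation, and the left-positive sign convention are all assumptions.

```rust
/// Hypothetical Frenet coordinate: arc length `s` along the centerline and
/// signed lateral offset `d` (positive to the left of the travel direction).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Frenet {
    pub s: f64,
    pub d: f64,
}

/// Project `p` onto a polyline centerline and return its Frenet coordinate.
/// Returns `None` if the polyline has no segment of non-zero length.
pub fn to_frenet(centerline: &[[f64; 2]], p: [f64; 2]) -> Option<Frenet> {
    let mut best: Option<(f64, Frenet)> = None; // (squared distance, coordinate)
    let mut s_start = 0.0; // arc length at the start of the current segment
    for seg in centerline.windows(2) {
        let (a, b) = (seg[0], seg[1]);
        let (dx, dy) = (b[0] - a[0], b[1] - a[1]);
        let len = (dx * dx + dy * dy).sqrt();
        if len == 0.0 {
            continue; // skip duplicated points
        }
        let (px, py) = (p[0] - a[0], p[1] - a[1]);
        // Fraction along the segment of the closest point, clamped to [0, 1].
        let t = ((px * dx + py * dy) / (len * len)).clamp(0.0, 1.0);
        let (cx, cy) = (a[0] + t * dx, a[1] + t * dy);
        let dist_sq = (p[0] - cx).powi(2) + (p[1] - cy).powi(2);
        if best.map_or(true, |(d2, _)| dist_sq < d2) {
            // The 2D cross product is positive when `p` is left of the segment.
            let cross = dx * py - dy * px;
            let d = dist_sq.sqrt() * cross.signum();
            best = Some((dist_sq, Frenet { s: s_start + t * len, d }));
        }
        s_start += len;
    }
    best.map(|(_, f)| f)
}
```

A localizer along these lines would run the projection against each candidate lanelet and keep the one with the smallest `|d|`; how the shipped localizer picks among overlapping lanelets is not specified here.
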

LAYER 3 · PREDICTION

Map-aware CV + CTRV

Constant-velocity and constant-turn-rate kinematic models, lane-snapped via Frenet projection when the agent is on a mapped lanelet. 1.5 s horizon, confidence half-life decay, weight-bucketed seeding.

  • horizon_s: 1.5
  • n_steps: 4
  • confidence_half_life_s: 1.2
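
One way the two kinematic models and the confidence decay could fit together is sketched below. `AgentState`, `PredictedPoint`, and `predict` are hypothetical names; the evenly spaced steps ending exactly at the horizon and the decay law `c0 * 0.5^(t / half_life)` are assumptions, so the shipped predictor's numbers may differ.

```rust
/// Hypothetical agent state in a local metric frame.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AgentState {
    pub x: f64,
    pub y: f64,
    pub heading: f64,  // rad, counter-clockwise from the +x axis
    pub speed: f64,    // m/s
    pub yaw_rate: f64, // rad/s, positive = turning left
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PredictedPoint {
    pub t: f64,
    pub x: f64,
    pub y: f64,
    pub confidence: f64,
}

/// Roll the state forward over `n_steps` evenly spaced times up to `horizon_s`.
/// Uses CTRV when the yaw rate is non-negligible and CV otherwise.
pub fn predict(
    state: &AgentState,
    horizon_s: f64,
    n_steps: usize,
    half_life_s: f64,
    initial_confidence: f64,
) -> Vec<PredictedPoint> {
    (1..=n_steps)
        .map(|k| {
            let t = horizon_s * k as f64 / n_steps as f64;
            let (x, y) = if state.yaw_rate.abs() < 1e-6 {
                // Constant velocity: straight line along the current heading.
                (
                    state.x + state.speed * state.heading.cos() * t,
                    state.y + state.speed * state.heading.sin() * t,
                )
            } else {
                // Constant turn rate and velocity: arc of radius speed / yaw_rate.
                let r = state.speed / state.yaw_rate;
                let h1 = state.heading + state.yaw_rate * t;
                (
                    state.x + r * (h1.sin() - state.heading.sin()),
                    state.y + r * (state.heading.cos() - h1.cos()),
                )
            };
            // Assumed decay law: confidence halves every `half_life_s` seconds.
            let confidence = initial_confidence * 0.5_f64.powf(t / half_life_s);
            PredictedPoint { t, x, y, confidence }
        })
        .collect()
}
```

With the defaults above, `predict(&state, 1.5, 4, 1.2, c0)` yields points at 0.375 s spacing, the last one at t = 1.5 s.
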

WHY IT MATTERS

Map-aware prediction ≠ freeze-frame

Most camera-only stacks treat detected agents as static obstacles, re-detected every frame. That's fine at 5 m/s in a parking lot but catastrophic at 20 m/s on a highway, where an agent covers 30 m within the 1.5 s horizon. We project each track into the Frenet frame of its lanelet and predict forward along the road.

  • A car in a left-turn lane is predicted to turn left, not to plough through the median.
  • A pedestrian at a crosswalk stays on the crosswalk until they reach the other side.
  • Oncoming traffic stays in its lane unless the HD map says a lane change is legal.
  • Agents with no lanelet match fall back to unsnapped constant-velocity prediction: a safe default, not a silent failure.
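
The left-turn and fallback behaviours in the list above can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped predictor: `LaneMatch`, `Track`, `from_frenet`, and `predict_position` are hypothetical names, the centerline is a plain polyline, the agent is assumed to keep its current speed along the lane, and arc lengths past the end of the centerline are clamped to its last point.

```rust
/// Hypothetical lanelet match: the centerline plus the agent's Frenet coordinate.
pub struct LaneMatch<'a> {
    pub centerline: &'a [[f64; 2]],
    pub s: f64, // arc length along the centerline, m
    pub d: f64, // lateral offset, m, positive to the left
}

/// Hypothetical track record; `lane` is `None` when no lanelet matched.
pub struct Track<'a> {
    pub position: [f64; 2],
    pub velocity: [f64; 2],
    pub lane: Option<LaneMatch<'a>>,
}

/// Point at arc length `s` on the polyline, shifted laterally by `d`.
/// Returns `None` if the polyline has no segment of non-zero length.
pub fn from_frenet(centerline: &[[f64; 2]], s: f64, d: f64) -> Option<[f64; 2]> {
    let mut remaining = s.max(0.0);
    let mut last = None;
    for seg in centerline.windows(2) {
        let (a, b) = (seg[0], seg[1]);
        let (dx, dy) = (b[0] - a[0], b[1] - a[1]);
        let len = (dx * dx + dy * dy).sqrt();
        if len == 0.0 {
            continue; // skip duplicated points
        }
        let (ux, uy) = (dx / len, dy / len); // unit tangent
        let along = remaining.min(len);
        // The left normal of the tangent (ux, uy) is (-uy, ux).
        let point = [a[0] + ux * along - uy * d, a[1] + uy * along + ux * d];
        if remaining <= len {
            return Some(point);
        }
        remaining -= len;
        last = Some(point); // keep the segment end in case `s` overshoots
    }
    last
}

/// Predict the position after `t` seconds: along the lane when a lanelet
/// matched, otherwise by unsnapped constant velocity.
pub fn predict_position(track: &Track<'_>, t: f64) -> [f64; 2] {
    let [vx, vy] = track.velocity;
    let cv = [track.position[0] + vx * t, track.position[1] + vy * t];
    match &track.lane {
        Some(m) => {
            let speed = (vx * vx + vy * vy).sqrt();
            from_frenet(m.centerline, m.s + speed * t, m.d).unwrap_or(cv)
        }
        None => cv,
    }
}
```

On a centerline that runs 10 m along +x and then turns left along +y, a matched car 5 m in at 10 m/s is predicted at (10, 5) after one second. The same car with no lanelet match is predicted at (15, 0), straight through the corner.
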

drive_demo.rs
[map]     2 lanelets, 1 ODD polygon loaded
[map]     route 1→4: [1, 2, 3, 4] (cost=48.0 m)
[tracks]  3 agents: lead-car, oncoming-car, pedestrian

[predict] horizon=1.5s, 4 steps
  track  1 (class 2): t=+1.50s → (+0.0, +30.0) conf=0.47
  track  2 (class 2): t=+1.50s → (-3.5, +7.0)  conf=0.47
  track  3 (class 0): t=+1.50s → (+0.8, +35.0) conf=0.47

[seeds]   3 weighted PDE seeds → 1 bucket
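
For readers curious about the `route` line in the demo output, here is a minimal sketch of A* over a lanelet graph. It is not the shipped `plan_route`: the `Lanelet` struct, the integer-centimetre units, and the cost definition (sum of the lengths of every lanelet on the route, start and goal included) are assumptions for illustration. The heuristic is the straight-line distance between lanelet end points, which never overestimates as long as each lanelet is at least as long as the chord between its ends.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

/// Hypothetical routing node: one lanelet, its length, and its successors.
pub struct Lanelet {
    pub length_cm: u64,
    pub end_cm: (i64, i64), // position of the lanelet's end point, in cm
    pub successors: Vec<u32>,
}

/// Straight-line distance in whole centimetres, rounded down.
fn heuristic_cm(a: (i64, i64), b: (i64, i64)) -> u64 {
    let (dx, dy) = ((a.0 - b.0) as f64, (a.1 - b.1) as f64);
    (dx * dx + dy * dy).sqrt().floor() as u64
}

/// A* search from `start` to `goal`. Returns the lanelet ids on the route
/// and the total cost in cm, or `None` if the goal is unreachable.
pub fn plan_route(
    map: &HashMap<u32, Lanelet>,
    start: u32,
    goal: u32,
) -> Option<(Vec<u32>, u64)> {
    let goal_end = map.get(&goal)?.end_cm;
    let first = map.get(&start)?;
    let mut best_g: HashMap<u32, u64> = HashMap::new(); // cheapest known cost
    let mut parent: HashMap<u32, u32> = HashMap::new();
    let mut open = BinaryHeap::new(); // min-heap on (g + h) via `Reverse`
    best_g.insert(start, first.length_cm);
    open.push(Reverse((first.length_cm + heuristic_cm(first.end_cm, goal_end), start)));

    while let Some(Reverse((_, id))) = open.pop() {
        let g = best_g[&id];
        if id == goal {
            // Walk the parent links back to the start, then reverse.
            let mut route = vec![goal];
            let mut cur = goal;
            while let Some(&p) = parent.get(&cur) {
                route.push(p);
                cur = p;
            }
            route.reverse();
            return Some((route, g));
        }
        for &next in &map[&id].successors {
            let Some(lanelet) = map.get(&next) else { continue };
            let g_next = g + lanelet.length_cm;
            if best_g.get(&next).map_or(true, |&old| g_next < old) {
                best_g.insert(next, g_next);
                parent.insert(next, id);
                open.push(Reverse((g_next + heuristic_cm(lanelet.end_cm, goal_end), next)));
            }
        }
    }
    None
}
```

Integer costs keep the heap ordering simple; a floating-point variant would need an ordered wrapper around `f64`.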

Camera-first, not camera-only

The reference stack ships with camera + GPS + wheel odometry. Radar and LiDAR are optional and compose by fusion at the track level. The PDE field doesn't care where the track came from.

Camera

Required

Monocular RPi Camera Module 3 or equivalent. HFOV 66°.

GPS

Required

u-blox F9P or equivalent. Any NMEA feed works.

Wheel odom

Required

CAN frame on speed_feedback_id. 20 Hz default.

Radar

Optional

Continental ARS, Aptiv ESR, or mmWave on ROS 2.

Ship the map. Ship the predictor. Keep your sensors.

We integrate at the track level, not the pixel level. Drop us into an existing Tier-1 perception stack in a weekend.