Camera-first perception, map-grounded reasoning.
A YOLOv8n detector that fits in under 12 MB, a Lanelet2 HD-map localizer that turns GPS and wheel odometry into lane-accurate pose, and a map-aware motion predictor that actually understands the road geometry your ego car is sitting on.
Three layers turn pixels into prediction
Each layer is independently swappable. Bring your own perception, keep our HD map. Bring your own map, keep our prediction. Composable by design.
YOLOv8n (ONNX)
ONNX Runtime, CPU execution, sub-12 MB quantized weights. Classes: vehicle, pedestrian, cyclist, traffic sign. Bring-your-own detectors are supported: we consume tracks, not pixels, at the field boundary.
- input: 640 × 480 @ 20 FPS
- conf_threshold: 0.4
- target: RPi Camera Module 3
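To make the conf_threshold setting concrete, here is a minimal sketch of a confidence gate applied to detector output. The `Detection` type and `filter_detections` function are hypothetical names for illustration, not the shipped API, and treating the threshold as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record; not the stack's real type."""
    label: str          # one of: vehicle, pedestrian, cyclist, traffic sign
    confidence: float   # detector score in [0, 1]
    box_xywh: tuple     # pixel box (x, y, w, h) in the 640 x 480 frame


def filter_detections(detections, conf_threshold=0.4):
    """Keep detections whose score is at or above the threshold.

    Assumption: the comparison is inclusive (>=).
    """
    return [d for d in detections if d.confidence >= conf_threshold]
```

Anything that survives the gate would then be handed to a tracker, since the downstream layers consume tracks rather than raw boxes.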
Lanelet2 HD map
Native Lanelet2 OSM XML importer. Lanelets, crosswalks, stop lines, ODD polygons. Route planner with A* across the lanelet graph. Localization projects GPS + odometry onto the nearest lanelet.
- import_osm_xml()
- plan_route(start, goal)
- Frenet (s, d) frame available
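For readers who want to see what A* across a lanelet graph looks like, here is a minimal sketch. It is not the library's plan_route(): the name `plan_route_sketch`, the adjacency-dict graph, the centroid table, and the edge lengths are all assumptions made for illustration. The straight-line heuristic only guarantees shortest routes when no edge cost is shorter than the straight-line distance it covers.

```python
import heapq
import math


def plan_route_sketch(graph, centroids, start, goal):
    """A* over a lanelet graph (illustrative, hypothetical data layout).

    graph:     {lanelet_id: [(successor_id, length_m), ...]}
    centroids: {lanelet_id: (x, y)} used for the straight-line heuristic
    Returns (path, cost_m), or (None, inf) if the goal is unreachable.
    """
    def h(node):
        (x0, y0), (x1, y1) = centroids[node], centroids[goal]
        return math.hypot(x1 - x0, y1 - y0)

    # Heap entries: (f = g + h, g, node, path so far)
    open_heap = [(h(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while open_heap:
        _, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, g
        if g > best_g.get(node, math.inf):
            continue  # stale entry; a cheaper path to this node was found
        for succ, length_m in graph.get(node, []):
            g2 = g + length_m
            if g2 < best_g.get(succ, math.inf):
                best_g[succ] = g2
                heapq.heappush(open_heap, (g2 + h(succ), g2, succ, path + [succ]))
    return None, math.inf


# Toy map (made-up geometry): a straight chain 1-2-3-4 plus a longer detour via 5.
GRAPH = {1: [(2, 16.0), (5, 40.0)], 2: [(3, 16.0)], 3: [(4, 16.0)], 5: [(4, 40.0)]}
CENTROIDS = {1: (0, 0), 2: (0, 16), 3: (0, 32), 4: (0, 48), 5: (20, 24)}
```

Calling `plan_route_sketch(GRAPH, CENTROIDS, 1, 4)` on this toy map picks the chain through lanelets 2 and 3 over the detour through 5.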
Map-aware CV + CTRV
Constant-velocity and constant-turn-rate kinematic models, lane-snapped via Frenet projection when the agent is on a mapped lanelet. 1.5 s horizon, confidence half-life decay, weight-bucketed seeding.
- horizon_s: 1.5
- n_steps: 4
- confidence_half_life_s: 1.2
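The two kinematic models and the half-life decay can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped predictor: the function names are hypothetical, the steps are assumed to be evenly spaced with the last one landing on the horizon, and the decay is assumed to be a plain exponential from an initial confidence `c0`. The page does not spell out the exact form the stack uses.

```python
import math


def step_times(horizon_s=1.5, n_steps=4):
    """Evenly spaced prediction times ending at the horizon (assumed spacing)."""
    return [horizon_s * (i + 1) / n_steps for i in range(n_steps)]


def predict_cv(x, y, vx, vy, t):
    """Constant velocity: straight-line extrapolation."""
    return x + vx * t, y + vy * t


def predict_ctrv(x, y, speed, heading, yaw_rate, t):
    """Constant turn rate and velocity: motion along a circular arc.

    Falls back to straight-line motion when the yaw rate is near zero,
    where the arc formula would divide by (almost) zero.
    """
    if abs(yaw_rate) < 1e-6:
        return (x + speed * math.cos(heading) * t,
                y + speed * math.sin(heading) * t,
                heading)
    radius = speed / yaw_rate
    new_heading = heading + yaw_rate * t
    return (x + radius * (math.sin(new_heading) - math.sin(heading)),
            y - radius * (math.cos(new_heading) - math.cos(heading)),
            new_heading)


def decayed_confidence(c0, t, half_life_s=1.2):
    """Halve the confidence every half_life_s seconds (assumed decay form)."""
    return c0 * 0.5 ** (t / half_life_s)
```

With the defaults above, `step_times()` yields four steps 0.375 s apart, and confidence drops to half its starting value at t = 1.2 s.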
Map-aware prediction ≠ freeze-frame
Most camera-only stacks treat detected agents as static obstacles, re-detected every frame. That's fine at 5 m/s in a parking lot, catastrophic at 20 m/s on a highway. We project each track into the Frenet frame of its lanelet and predict forward along the road.
- A car in a left-turn lane is predicted to turn left, not to plough through the median.
- A pedestrian at a crosswalk stays on the crosswalk until they reach the other side.
- Oncoming traffic stays in its lane unless the HD map says a lane change is legal.
- Agents with no lanelet match fall back to unsnapped constant-velocity prediction: a safe default, not a silent failure.
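The behaviour in the list above can be sketched as a Frenet round trip: project the agent onto its lanelet centerline, advance the arc length s, hold the lateral offset d, and convert back. This is a simplified illustration, not the shipped code. The function names are hypothetical, the centerline is assumed to be a polyline with no zero-length segments, the agent is assumed to travel along the lanelet direction at its current speed, and positions past the end of the centerline are extrapolated along the last segment.

```python
import math


def to_frenet(centerline, px, py):
    """Project a point onto a polyline; return (s, d), d > 0 left of travel."""
    best = None
    s_start = 0.0
    for (x0, y0), (x1, y1) in zip(centerline, centerline[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        # Fraction along the segment of the closest point, clamped to [0, 1]
        u = ((px - x0) * dx + (py - y0) * dy) / (seg_len ** 2)
        u = min(1.0, max(0.0, u))
        dist = math.hypot(px - (x0 + u * dx), py - (y0 + u * dy))
        if best is None or dist < best[0]:
            cross = dx * (py - y0) - dy * (px - x0)  # sign gives left/right
            best = (dist, s_start + u * seg_len, math.copysign(dist, cross))
        s_start += seg_len
    return best[1], best[2]


def from_frenet(centerline, s, d):
    """Convert (s, d) back to (x, y); extrapolates past the last segment."""
    segments = list(zip(centerline, centerline[1:]))
    s_start = 0.0
    for i, ((x0, y0), (x1, y1)) in enumerate(segments):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if s <= s_start + seg_len or i == len(segments) - 1:
            u = (s - s_start) / seg_len
            nx, ny = -dy / seg_len, dx / seg_len  # unit normal pointing left
            return x0 + u * dx + d * nx, y0 + u * dy + d * ny
        s_start += seg_len


def predict_position(x, y, vx, vy, t, centerline=None):
    """Lane-snapped prediction, or unsnapped constant velocity with no lanelet."""
    if centerline is None:
        return x + vx * t, y + vy * t  # fallback: no lanelet match
    s, d = to_frenet(centerline, x, y)
    return from_frenet(centerline, s + math.hypot(vx, vy) * t, d)


# Made-up left-turn lane: north for 20 m, then west for 20 m.
LEFT_TURN = [(0.0, 0.0), (0.0, 20.0), (-20.0, 20.0)]
```

On this toy lane, a car at (0.5, 10) heading north at 10 m/s is carried around the corner when `LEFT_TURN` is supplied, while the same call without a centerline keeps going straight north.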
[map] 2 lanelets, 1 ODD polygon loaded
[map] route 1→4: [1, 2, 3, 4] (cost=48.0 m)
[tracks] 3 agents: lead-car, oncoming-car, pedestrian
[predict] horizon=1.5s, 4 steps
track 1 (class 2): t=+1.50s → (+0.0, +30.0) conf=0.47
track 2 (class 2): t=+1.50s → (-3.5, +7.0) conf=0.47
track 3 (class 0): t=+1.50s → (+0.8, +35.0) conf=0.47
[seeds] 3 weighted PDE seeds → 1 bucket
Camera-first, not camera-only
The reference stack ships with camera + GPS + wheel odometry. Radar and LiDAR are optional and compose by fusion at the track level. The PDE field doesn't care where the track came from.
Camera
Required. Mono RPi Cam 3 or equivalent. HFOV 62.2°.
GPS
Required. u-blox F9P or equivalent. Any NMEA feed works.
Wheel odom
Required. CAN frame on speed_feedback_id. 20 Hz default.
Radar
Optional. Continental ARS, Aptiv ESR, or mmWave on ROS 2.
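Since any NMEA feed can supply the GPS input, here is a minimal sketch of pulling a position out of a GGA sentence. The function names are hypothetical and this is not the stack's own parser. It relies only on the standard NMEA 0183 layout: an XOR checksum over the characters between `$` and `*`, latitude as ddmm.mmmm, longitude as dddmm.mmmm, and a fix-quality field where 0 means no fix.

```python
def nmea_checksum(body):
    """XOR of every byte between '$' and '*' (standard NMEA checksum)."""
    checksum = 0
    for byte in body.encode("ascii"):
        checksum ^= byte
    return checksum


def _to_degrees(value, hemisphere, deg_digits):
    """Convert (d)ddmm.mmmm plus a hemisphere letter to signed decimal degrees."""
    degrees = int(value[:deg_digits])
    minutes = float(value[deg_digits:])
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence):
    """Return (lat, lon) in decimal degrees, or None if invalid or no fix."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return None
    body, _, checksum = sentence[1:].partition("*")
    try:
        if int(checksum[:2], 16) != nmea_checksum(body):
            return None  # corrupted sentence
    except ValueError:
        return None
    fields = body.split(",")
    if len(fields) < 7 or not fields[0].endswith("GGA"):
        return None  # not a GGA sentence (any talker ID is accepted)
    if fields[6] in ("", "0"):
        return None  # fix quality 0 means no position fix
    return (_to_degrees(fields[2], fields[3], 2),
            _to_degrees(fields[4], fields[5], 3))


# Build an example sentence with a freshly computed checksum.
BODY = "GNGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,"
SENTENCE = f"${BODY}*{nmea_checksum(BODY):02X}"
```

`parse_gga(SENTENCE)` returns a latitude near 48.1173 and a longitude near 11.5167. Sentences with a bad checksum or no fix return `None`, so the caller can skip them and not feed a bogus position to the localizer.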
Ship the map. Ship the predictor. Keep your sensors.
We integrate at the track level, not the pixel level. Drop us into an existing Tier-1 perception stack in a weekend.