Camera Media Pipeline
Camera Media Pipeline
The sorter treats camera video as two separate products:
- Computer-vision frames: Python/OpenCV-owned, full quality, used for detection, tracking, samples, calibration, and still captures.
- Browser live preview: WebRTC transport, hardware encoded where possible, optimized for low latency and stable inspection quality.
Python should stay the control plane. It owns camera assignment, capture mode, picture settings, detection orchestration, incidents, sample capture, and metadata. It should not be the long-term 4K live-video encoder.
Target Architecture
Camera
-> Python/OpenCV CaptureThread
-> detection / tracking / samples / high-quality stills
-> metadata events
Camera
-> platform media pipeline
-> hardware H.264 encoder
-> WebRTC browser stream
Browser
-> video element
-> metadata overlay canvas/SVG for boxes, zones, incidents, telemetry
The browser overlay is deliberately not burned into the video. That keeps the video encoder focused on the image and lets the UI render boxes/zones crisply at display resolution.
Platform Targets
macOS
Preferred stack:
avfvideosrc -> videoconvert -> vtenc_h264_hw -> h264parse -> WebRTC handoff
The encoder target is Apple VideoToolbox through GStreamer.
Orange Pi 5 / RK3588
Preferred stack:
v4l2src -> videoconvert/RGA -> mpph264enc or rkv4l2h264enc -> h264parse -> WebRTC handoff
The encoder target is the Rockchip media pipeline. Element names vary by image,
so the planner accepts mpph264enc, rkv4l2h264enc, or v4l2h264enc.
Current Implementation Marker
GET /api/cameras/media-pipeline reports the desired backend per camera role.
It returns:
- selected backend:
gstreamer_hardwareor the currentpython_aiortcfallback - required tools and GStreamer elements
- the planned GStreamer pipeline stage
- why the system is falling back
This endpoint is the migration boundary: UI and runtime code should depend on this capability state instead of hardcoding transport decisions across components.
Migration Rules
- Prefer lowering preview FPS over lowering capture resolution.
- Keep full-resolution still capture available even when live preview is reduced.
- Do not duplicate camera opens for normal operation.
- Do not burn overlays into the long-term live-video stream.
- Keep Python aiortc as a fallback, not as the final 4K preview foundation.