# 🏗️ Architecture: Distributed Batch Inference
The FirstBreath Vision System is designed to solve a critical scalability problem: because of the GIL (Global Interpreter Lock), Python threads cannot efficiently run heavy AI inference loops concurrently.
To bypass this, we implemented a Producer-Consumer pipeline with Redis as a high-speed broker.
## The 3-Stage Pipeline
### 1. 📷 Camera Manager (The Producer)

- Service: `services/camera-manager`
- Role: I/O Bound
- Logic (sketched below):
  - Connects to RTSP streams via OpenCV (hardware decoded).
  - Resizes frames to the model input size (640x640) immediately.
  - Serializes frames (binary JPEG) and pushes them to the `batch_frames` Redis list.
- Why?: Keeps heavy I/O operations away from the GPU process. Can scale horizontally (multiple managers for hundreds of cameras).
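A minimal sketch of the producer loop, assuming redis-py and opencv-python. The `produce` helper and the pickle envelope are illustrative (the real wire format may differ), but the queue name, resize target, and JPEG quality match the description above:

```python
import pickle

import cv2
import redis

r = redis.Redis()  # broker host/port are deployment-specific

def produce(camera_id: str, rtsp_url: str) -> None:
    cap = cv2.VideoCapture(rtsp_url)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break  # reconnect/backoff handling omitted for brevity
        # Resize to model input size immediately, before queuing.
        frame = cv2.resize(frame, (640, 640))
        # JPEG quality 85 keeps payloads small on the wire.
        ok, jpg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 85])
        if ok:
            # The envelope must carry the camera id so the inference
            # worker can route results back to their source.
            payload = pickle.dumps({"camera_id": camera_id, "jpeg": jpg.tobytes()})
            r.lpush("batch_frames", payload)

produce("cam-01", "rtsp://example-host/stream1")  # hypothetical camera
```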
### 2. 🧠 Batch Inference (The Worker)

- Service: `services/batch-inference`
- Role: Compute Bound (GPU)
- Logic (sketched below):
  - Pulls N frames from Redis at once (dynamic batching).
  - Constructs a single tensor `[BatchSize, 3, 640, 640]`.
  - Runs inference once on the GPU.
  - Splits results back out by camera ID.
  - Pushes raw bounding boxes to `batch_results`.
- Performance: Increases throughput by ~400% compared to sequential per-camera processing.
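A sketch of the worker's inner loop. It assumes redis-py >= 4.0 and Redis >= 6.2 (for `RPOP` with a count), the pickle envelope from the producer sketch, and placeholder ONNX input/output names:

```python
import pickle

import cv2
import numpy as np
import onnxruntime as ort
import redis

r = redis.Redis()
# Provider order is an assumption: TensorRT first, CUDA as fallback.
sess = ort.InferenceSession(
    "yolov11.onnx",
    providers=["TensorrtExecutionProvider", "CUDAExecutionProvider"],
)
MAX_BATCH = 32

while True:
    # Drain up to MAX_BATCH frames in one round trip: the dynamic batch.
    raw = r.rpop("batch_frames", MAX_BATCH)
    if not raw:
        continue
    items = [pickle.loads(m) for m in raw]
    imgs = []
    for it in items:
        img = cv2.imdecode(np.frombuffer(it["jpeg"], np.uint8), cv2.IMREAD_COLOR)
        imgs.append(img.transpose(2, 0, 1).astype(np.float32) / 255.0)  # HWC -> CHW
    batch = np.stack(imgs)  # [BatchSize, 3, 640, 640]
    # One GPU pass for the whole batch; tensor names depend on the export.
    (preds,) = sess.run(None, {"images": batch})
    for it, boxes in zip(items, preds):
        # Split the batched output back out by camera id.
        r.lpush("batch_results",
                pickle.dumps({"camera_id": it["camera_id"], "boxes": boxes}))
```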
### 3. ⚙️ Redis Worker (The Logic)

- Service: `services/redis-worker`
- Role: CPU / Logic Bound
- Logic (sketched below):
  - Consumes detection results.
  - Post-processing: Non-Maximum Suppression (NMS) and filtering of low-confidence detections.
  - Business logic: "Is the horse down?", "Is it moving too fast?"
  - Smoothing: Applies sliding-window filters to prevent false positives.
  - Persistence: Sends alerts to the Backend.
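A sketch of the logic stage, assuming the result envelope above, a `[x1, y1, x2, y2, confidence, class]` box layout, and a hypothetical backend alert endpoint; the thresholds, class id, and HTTP call are all illustrative:

```python
import pickle
from collections import defaultdict, deque

import redis
import requests  # hypothetical: alerts POSTed to the backend over HTTP

r = redis.Redis()
CONF_THRESHOLD = 0.5
DOWN_CLASS_ID = 1  # hypothetical class id for a downed horse
windows = defaultdict(lambda: deque(maxlen=10))  # per-camera sliding windows

while True:
    _, raw = r.brpop("batch_results")  # block until a result arrives
    msg = pickle.loads(raw)
    cam = msg["camera_id"]
    # Drop low-confidence detections; NMS would also run at this stage.
    boxes = [b for b in msg["boxes"] if float(b[4]) >= CONF_THRESHOLD]
    down = any(int(b[5]) == DOWN_CLASS_ID for b in boxes)
    windows[cam].append(down)
    # Smoothing: alert only when the condition persists across most of
    # the window, which suppresses single-frame false positives.
    if sum(windows[cam]) >= 8:
        requests.post("https://backend.example/alerts",
                      json={"camera_id": cam, "event": "horse_down"})
        windows[cam].clear()
```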
## Key Technologies
### 🚀 TensorRT & YOLOv11

We use ONNX Runtime (GPU) with the TensorRT execution provider. The YOLOv11 model is exported with dynamic batch size support, allowing anywhere from 1 to 32 cameras to be processed in a single pass (see the export sketch below).
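One plausible way to produce such a model, assuming the Ultralytics toolchain (the repo's actual export script and checkpoint may differ); `dynamic=True` writes an ONNX graph with a symbolic batch axis:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # any YOLOv11 checkpoint; the name is illustrative
# A symbolic batch dimension lets one session serve 1..32 frames per pass.
model.export(format="onnx", dynamic=True)
```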
### ⚡ Redis & Serialization
To keep latency minimal (<50 ms):
- Frames are encoded as JPEG (quality 85) before transmission to reduce bandwidth.
- Redis is configured as an ephemeral in-memory store (no persistence for the frame queues), as sketched below.
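A sketch of the ephemeral setup. These settings normally live in `redis.conf`; they are applied via `CONFIG SET` here purely for illustration, and the queue cap is an assumption:

```python
import redis

r = redis.Redis()
r.config_set("save", "")          # disable RDB snapshots
r.config_set("appendonly", "no")  # disable the append-only file
# Optionally bound the frame queue so a stalled worker cannot exhaust RAM.
r.ltrim("batch_frames", 0, 999)   # keep only the 1,000 newest frames
```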