Skip to main content

Troubleshooting

Common issues encountered when deploying the monitoring stack and how to solve them.

πŸ›‘ "NVIDIA CUDA: NO" in Logs​

Symptoms:

  • camera-manager logs show NVIDIA CUDA: NO during startup.
  • Log error: Could not find decoder 'h264_cuvid'.
  • High CPU usage (ffmpeg falling back to software decoding).

Cause: The ultralytics package (YOLO) declares opencv-python (CPU version) as a dependency. Pip installs it automatically, overwriting the custom header-only OpenCV (CUDA version) provided by the base image.

Solution: You must explicitly uninstall the standard package in the Dockerfile:

# correct sequence
RUN pip install -r requirements.cuda.txt && \
pip uninstall -y opencv-python opencv-python-headless

And Rebuild the image: docker-compose up -d --build --force-recreate.

πŸ“‰ Grafana: "Datasource not found"​

Symptoms:

  • Dashboards show red error boxes.
  • Error message: Datasource ${DS_PROMETHEUS} was not found.

Cause: This happens when importing dashboards exported from other systems that use Templating Variables for datasources, but your Grafana instance expects a hardcoded UID (e.g., prometheus).

Solution:

  1. Use hardcoded UID: In datasource.yml, ensure uid: prometheus.
  2. Clean JSON: Replace "${DS_PROMETHEUS}" with "prometheus" in all dashboard JSON files.
  3. Delete Cache: Grafana stores dashboards in its SQLite DB. If you updated the JSON file but see no change, Grafana is serving the cached version.
    • Fix: Set allowUiUpdates: false in dashboard.yml to force file-based loading, or delete the grafana-data volume.

🌐 "Server Misbehaving" (DNS Error)​

Symptoms:

  • Prometheus Targets page shows down for monitoring-cadvisor or monitoring-node-exporter.
  • Error: server returned HTTP status 503 or dial tcp: lookup monitoring-cadvisor on ...: no such host.

Cause: The containers are likely not on the same Docker network (monitor-net).

Solution: Check docker network inspect monitor-net. All 5 monitoring containers AND the camera-manager must be listed in the Containers section. If not, verify docker-compose.yml network definitions.