Computer Vision · YOLOv8 · Object Detection · Real-Time Systems · Production AI

YOLOv8 in Production: Building a Multi-Camera CCTV Anomaly Detection System

Qalab Hassnain Agha · August 14, 2025 · 14 min read


YOLOv8 achieves state-of-the-art object detection benchmarks on academic datasets. That's well documented. What's less documented is what happens when you deploy YOLOv8 to process 8 simultaneous CCTV feeds in real time, detect anomalies across zone-based business rules, deliver WebSocket alerts to a security dashboard under 200ms, and keep false positives low enough that security staff don't start ignoring the alerts.

I built a multi-camera CCTV anomaly detection system through four evolutionary phases — from a single-camera prototype to a production system processing 8 simultaneous feeds at 91% detection accuracy.

What Anomaly Detection Actually Means in This Context

For this system, anomaly detection means detecting specific pre-defined behavioral patterns that violate business rules:

Unauthorized access to restricted zones
Loitering beyond a configurable time threshold
Crowd density exceeding zone-specific limits
Object left unattended beyond threshold duration
People count falling below minimum staffing levels in service areas

These are violations of explicit rules applied to detected objects in defined zones. YOLOv8 provides the object detection foundation. The business logic layer above it defines what constitutes an anomaly.

Phase 1: Single Camera Prototype

Model selection

YOLOv8 comes in five sizes: nano (n), small (s), medium (m), large (l), and extra-large (x). For real-time CCTV processing on an NVIDIA T4 GPU:

YOLOv8n: 45 FPS on T4, 78% mAP on our evaluation set
YOLOv8s: 28 FPS on T4, 86% mAP on our evaluation set

Sharing that throughput across 8 cameras gives 3.5 FPS per camera for YOLOv8s (28 / 8) and about 5.6 FPS per camera for YOLOv8n (45 / 8). Neither was enough. After INT8 quantization, YOLOv8s reached 67 FPS on the T4, or about 8.4 FPS per camera, which is acceptable for this use case because behavioral anomalies unfold over seconds.
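
The per-camera figures above are the aggregate throughput divided evenly across feeds. A minimal sketch of that arithmetic, assuming every camera gets an equal share (the function and dictionary names are illustrative, not from the production code):

```python
def per_camera_fps(total_fps: float, num_cameras: int) -> float:
    """Aggregate inference throughput shared evenly across cameras."""
    if num_cameras <= 0:
        raise ValueError("num_cameras must be positive")
    return total_fps / num_cameras


# Aggregate FPS on a T4, as quoted in the text.
measured = {"yolov8n": 45, "yolov8s": 28, "yolov8s_int8": 67}

# Per-camera frame rate when 8 feeds share one GPU.
budget = {name: per_camera_fps(fps, 8) for name, fps in measured.items()}

for name, fps in budget.items():
    print(f"{name}: {fps:.1f} FPS per camera")
```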

RTSP stream handling

RTSP is the standard for CCTV cameras. Handling RTSP streams reliably requires explicit reconnection logic — cameras go offline, network connections drop. We wrap each stream in a thread that monitors connection health and reconnects with exponential backoff. Camera status is tracked separately from detection.
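
A minimal sketch of the reconnect loop described above. It is not the production code: `open_source` is a hypothetical factory that returns any object with a `read() -> (ok, frame)` method (a thin wrapper around `cv2.VideoCapture` would fit), and the 1 s base delay and 60 s cap are assumed values.

```python
import threading
import time


def backoff_delay(attempt: int, base: float = 1.0, factor: float = 2.0,
                  cap: float = 60.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(base * factor ** attempt, cap)


class CameraStream:
    """Keeps the latest frame for one camera and reconnects when it drops."""

    def __init__(self, camera_id, open_source, sleep=time.sleep):
        self.camera_id = camera_id
        self.status = "connecting"   # camera health, tracked apart from detection
        self.latest_frame = None
        self._open_source = open_source
        self._sleep = sleep
        self._stop = threading.Event()
        self._lock = threading.Lock()

    def start(self):
        thread = threading.Thread(target=self.run, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()

    def run(self):
        attempt = 0
        while not self._stop.is_set():
            try:
                source = self._open_source()
                self.status = "online"
                while not self._stop.is_set():
                    ok, frame = source.read()
                    if not ok:
                        break            # stream dropped, fall through to reconnect
                    attempt = 0          # a good frame resets the backoff
                    with self._lock:
                        self.latest_frame = frame
            except OSError:
                pass                     # connect or read error: treat as a drop
            if self._stop.is_set():
                break
            self.status = "offline"
            self._sleep(backoff_delay(attempt))
            attempt += 1
```

Resetting the backoff only after a successful frame, rather than after a successful connect, avoids a tight retry loop against a camera that accepts connections but never delivers video.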

Phase 2: Multi-Camera Architecture

Frame batching across cameras

Running a separate model inference in each camera thread wastes GPU resources, because every call pays its own launch and transfer overhead for a batch of one. Instead, we collect one frame from each active camera, stack them into a single tensor, and run one batched inference call. Batching does not multiply throughput: the 67 FPS is the aggregate rate shared by all 8 cameras, which is where the 8.4 FPS per camera figure comes from.
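
The batching step can be sketched independently of the model. In this illustration `infer_batch` is a hypothetical callable that takes a list of frames and returns one result per frame in the same order; in practice it would wrap the detector's batched call. The function name and the dictionary layout are assumptions.

```python
from typing import Any, Callable, Dict, List, Sequence


def run_batched_inference(
    latest_frames: Dict[str, Any],
    infer_batch: Callable[[List[Any]], Sequence[Any]],
) -> Dict[str, Any]:
    """Run one inference call covering every camera that has a frame."""
    # Skip cameras that are offline or have not produced a frame yet.
    camera_ids = [cid for cid, frame in latest_frames.items() if frame is not None]
    if not camera_ids:
        return {}

    batch = [latest_frames[cid] for cid in camera_ids]
    results = infer_batch(batch)          # single call for the whole batch
    if len(results) != len(batch):
        raise ValueError("model returned a different number of results than frames")

    # Map each result back to the camera its frame came from.
    return dict(zip(camera_ids, results))
```

Keeping the camera IDs in a list alongside the batch is what lets each detection be routed back to the right zone configuration after the shared call.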

Zone definition

Each camera has configurable detection zones — polygonal regions defined in pixel coordinates, stored in the database and loaded at startup. Changing zone boundaries requires no code changes or redeployment. For each detected object, we calculate zone occupancy using point-in-polygon testing.
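
Point-in-polygon testing is commonly done with ray casting. The sketch below is one way to do it, not necessarily the production implementation. It assumes the tested point is the bottom-center of the bounding box (roughly where a person's feet are); the article does not say which point is used, so treat `foot_point` as an illustrative choice.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray casting: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def foot_point(box: Tuple[float, float, float, float]) -> Point:
    """Bottom-center of an (x1, y1, x2, y2) box in pixel coordinates."""
    x1, _, x2, y2 = box
    return ((x1 + x2) / 2, y2)


def zones_for_box(box, zones: Dict[str, Sequence[Point]]) -> List[str]:
    """Names of all zones whose polygon contains the box's foot point."""
    x, y = foot_point(box)
    return [name for name, poly in zones.items() if point_in_polygon(x, y, poly)]
```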

Object tracking

To apply time-based rules (loitering threshold, unattended object duration), we need persistent object identities across frames. We use ByteTrack, a lightweight multi-object tracker that assigns stable IDs across frames even through brief occlusions. Each tracked object maintains a track ID, first and last detection timestamps, current zone, and detection history.
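
The per-track state listed above could be held in a small record keyed by the tracker's ID. This sketch covers only that bookkeeping, not the tracker itself. The `zone_entered_at` field, the bounded history length, and the `max_age_seconds` pruning window are assumptions added to make time-in-zone queries cheap.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional, Tuple


@dataclass
class TrackState:
    track_id: int
    first_seen: float
    last_seen: float
    current_zone: Optional[str] = None
    zone_entered_at: Optional[float] = None   # assumption: not named in the article
    history: Deque[Tuple[float, Optional[str]]] = field(
        default_factory=lambda: deque(maxlen=256)   # bounded to cap memory use
    )

    def update(self, timestamp: float, zone: Optional[str]) -> None:
        if zone != self.current_zone:
            # Zone changed: restart the dwell timer (or clear it outside any zone).
            self.current_zone = zone
            self.zone_entered_at = timestamp if zone is not None else None
        self.last_seen = timestamp
        self.history.append((timestamp, zone))

    def seconds_in_zone(self, now: float) -> float:
        if self.zone_entered_at is None:
            return 0.0
        return now - self.zone_entered_at


class TrackRegistry:
    def __init__(self, max_age_seconds: float = 2.0):
        self.tracks: Dict[int, TrackState] = {}
        self.max_age_seconds = max_age_seconds

    def observe(self, track_id: int, timestamp: float, zone: Optional[str]) -> TrackState:
        track = self.tracks.get(track_id)
        if track is None:
            track = TrackState(track_id, first_seen=timestamp, last_seen=timestamp)
            self.tracks[track_id] = track
        track.update(timestamp, zone)
        return track

    def prune(self, now: float) -> List[int]:
        """Drop tracks not seen for longer than max_age_seconds."""
        stale = [tid for tid, t in self.tracks.items()
                 if now - t.last_seen > self.max_age_seconds]
        for tid in stale:
            del self.tracks[tid]
        return stale
```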

Phase 3: Business Rules Engine and Alert Generation

Rules are defined in YAML rather than code. Each rule specifies camera ID, zone, object class, condition type, threshold, severity, and optional time windows.

rules:
  - name: "loitering_restricted_zone"
    camera_id: "cam_02"
    zone: "server_room_entrance"
    object_class: "person"
    condition: "duration_in_zone"
    threshold_seconds: 30
    alert_severity: "high"

  - name: "low_staffing_checkout"
    camera_id: "cam_05"
    zone: "checkout_area"
    object_class: "person"
    condition: "count_below"
    threshold_count: 2
    alert_severity: "medium"
    time_window: "business_hours"
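
A minimal sketch of how rules shaped like these might be evaluated. The rules are written as Python dictionaries mirroring the YAML so the example has no parser dependency. The track dictionary layout (`camera_id`, `object_class`, `zone`, `zone_entered_at`) is an assumption, and `time_window` handling is deliberately left out.

```python
RULES = [
    {
        "name": "loitering_restricted_zone",
        "camera_id": "cam_02",
        "zone": "server_room_entrance",
        "object_class": "person",
        "condition": "duration_in_zone",
        "threshold_seconds": 30,
        "alert_severity": "high",
    },
    {
        "name": "low_staffing_checkout",
        "camera_id": "cam_05",
        "zone": "checkout_area",
        "object_class": "person",
        "condition": "count_below",
        "threshold_count": 2,
        "alert_severity": "medium",
    },
]


def evaluate_rule(rule: dict, tracks: list, now: float) -> bool:
    """Return True when the rule's condition currently holds."""
    # Tracks that match the rule's camera, zone, and object class.
    matching = [
        t for t in tracks
        if t["camera_id"] == rule["camera_id"]
        and t["zone"] == rule["zone"]
        and t["object_class"] == rule["object_class"]
    ]
    condition = rule["condition"]
    if condition == "duration_in_zone":
        # Any single object dwelling longer than the threshold triggers the rule.
        return any(now - t["zone_entered_at"] > rule["threshold_seconds"]
                   for t in matching)
    if condition == "count_below":
        return len(matching) < rule["threshold_count"]
    raise ValueError(f"unknown condition: {condition}")
```

Dispatching on the `condition` string keeps new rule types to one extra branch, and an unknown condition fails loudly instead of silently never firing.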

False positive reduction

Raw detection results from YOLOv8 are noisy. We apply two filters before generating alerts:

Temporal debouncing: a rule must be continuously triggered for N frames before generating an alert — brief triggers are filtered as noise
Confidence thresholding: detections below 0.6 confidence are excluded from rule evaluation
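
Both filters are small enough to sketch. The 0.6 threshold comes from the text; the value of N is not stated, so `required_frames` is a parameter, and the choice to fire exactly once when the streak reaches N (rather than on every later frame) is an assumption.

```python
from typing import Dict, Hashable, List

CONFIDENCE_THRESHOLD = 0.6


def filter_by_confidence(detections: List[dict],
                         threshold: float = CONFIDENCE_THRESHOLD) -> List[dict]:
    """Drop detections below the confidence threshold before rule evaluation."""
    return [d for d in detections if d["confidence"] >= threshold]


class Debouncer:
    """Fires once when a rule has been triggered for N consecutive frames."""

    def __init__(self, required_frames: int):
        self.required_frames = required_frames
        self._streaks: Dict[Hashable, int] = {}

    def update(self, key: Hashable, triggered: bool) -> bool:
        if not triggered:
            # Any miss resets the streak, so brief triggers never alert.
            self._streaks.pop(key, None)
            return False
        streak = self._streaks.get(key, 0) + 1
        self._streaks[key] = streak
        return streak == self.required_frames
```

A reasonable key is the pair of rule name and track ID, so that one person's streak does not carry over to another.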

Phase 4: Real-Time Dashboard and Alert Delivery

Alert delivery architecture

The rules engine publishes alert events to a Redis channel. A dedicated alert delivery service subscribes and pushes events to connected WebSocket clients in the appropriate security group. The pub-sub pattern decouples detection performance from delivery performance — a slow WebSocket client doesn't affect the detection pipeline.
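
A sketch of the two ends of that channel. `client` is assumed to be a redis-py `Redis` instance, whose `publish(channel, message)` method returns the number of subscribers that received the message. The channel name and the JSON payload format are hypothetical.

```python
import json
from typing import Optional

ALERT_CHANNEL = "cctv:alerts"   # hypothetical channel name


def publish_alert(client, alert: dict, channel: str = ALERT_CHANNEL) -> int:
    """Rules-engine side: serialize the alert and publish it."""
    payload = json.dumps(alert, separators=(",", ":"), sort_keys=True)
    return client.publish(channel, payload)


def decode_alert(message: dict) -> Optional[dict]:
    """Delivery side: turn a pub/sub message dict into an alert, or None.

    redis-py pub/sub messages are dicts with a "type" key; subscribe
    confirmations and other control messages are skipped here.
    """
    if message.get("type") != "message":
        return None
    data = message["data"]
    if isinstance(data, bytes):
        data = data.decode("utf-8")
    return json.loads(data)
```

On the delivery side, the service would subscribe through `client.pubsub()`, pass each received message through `decode_alert`, and forward the non-`None` results to the WebSocket clients in the matching security group.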

Snapshot image handling

Each alert includes a snapshot of the triggering frame with the relevant zone and detected object highlighted, delivered within the 200ms budget:

Snapshot cropped and resized to dashboard display size at generation time, not delivery time
Compressed to JPEG quality 75 — readable for identification, fast to transfer
Uploaded to Azure Blob Storage asynchronously; alert delivered with a time-limited signed URL (a SAS URL, in Azure terms)
Dashboard loads image lazily — alert appears immediately, image loads as available
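
The ordering in the steps above (push the alert first, let the image arrive later) can be sketched with `asyncio`. Everything here is illustrative: `send_alert` and `upload_snapshot` are hypothetical coroutines standing in for the WebSocket push and the blob upload, and `snapshot_url` is assumed to be a signed URL computed before the upload finishes.

```python
import asyncio
from typing import Awaitable, Callable


async def deliver_alert(
    alert: dict,
    snapshot_bytes: bytes,
    send_alert: Callable[[dict], Awaitable[None]],
    upload_snapshot: Callable[[str, bytes], Awaitable[None]],
    snapshot_url: str,
) -> "asyncio.Task":
    """Push the alert immediately; upload the snapshot in the background."""
    # The alert carries the URL up front so the dashboard can lazy-load the image.
    alert = {**alert, "snapshot_url": snapshot_url}

    # Schedule the upload without waiting for it to complete.
    upload_task = asyncio.create_task(upload_snapshot(snapshot_url, snapshot_bytes))

    # The alert push is the only thing on the latency-critical path.
    await send_alert(alert)

    # Return the task so the caller keeps a reference and can log upload failures.
    return upload_task
```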

Production Numbers

Concurrent CCTV feeds: 8 simultaneous streams on a single NVIDIA T4
Detection accuracy: 91% mAP on the production evaluation set
False positive rate: 4% (down from 23% before debouncing and confidence thresholding)
Alert delivery latency: P95 < 180ms from triggering event to dashboard push
Frame processing rate: 8.4 FPS per camera (INT8 quantized YOLOv8s on T4)
Infrastructure cost: 60% lower than a single-camera-per-instance approach

Final Thoughts

Building a production CCTV anomaly detection system is substantially more complex than running YOLOv8 inference on video frames. The detection layer is the starting point, not the destination. The real engineering is in the tracking, the business rules engine, the false positive reduction, and the delivery infrastructure.

The false positive problem is the most underestimated challenge. A system that pages security staff 20 times per shift with false alerts trains them to ignore all alerts — including the real ones. Getting false positives below 5% required more engineering effort than everything else combined.

Solve the false positives first. Everything else is infrastructure.

Frequently Asked Questions

How many CCTV feeds can YOLOv8 process simultaneously in real time?

YOLOv8s quantized to INT8 achieves 67 FPS on an NVIDIA T4 GPU. With frame batching across cameras — collecting one frame per camera into a single batched inference call — this supports 8 simultaneous CCTV feeds at approximately 8.4 FPS per camera, sufficient for behavioral anomaly detection where events unfold over seconds.

How do you reduce false positives in AI-powered CCTV anomaly detection?

Apply two filters before generating alerts: temporal debouncing (require the rule to trigger for N consecutive frames before alerting, so brief false triggers are filtered as noise) and confidence thresholding (exclude YOLOv8 detections below 0.6 confidence). Together, these filters reduced our production false positive rate from 23% to 4%, a level at which security staff begin to trust the system.

How do you track objects across video frames for time-based anomaly rules like loitering detection?

Use ByteTrack, a lightweight multi-object tracker that assigns stable IDs to detected objects across frames even through brief occlusions. Each tracked object maintains its track ID, first and last detection timestamps, current zone, and detection history. That state enables rules like loitering detection (object in a restricted zone for more than N seconds) and unattended object alerts.