Qalab Hassnain Agha — CTO & AI Systems Architect — Islamabad, Pakistan

CTO & AI Systems Architect

Qalab Hassnain

Backend · Deep Learning · LLMs · Computer Vision · Data Science · IoT · Cloud

Available for work · Remote / Islamabad
🇦🇺 AU · 🇦🇪 UAE · 🇬🇧 UK · 🇵🇰 PK
View Resume
4+ Years in Production AI
15+ Projects Delivered
5★ Upwork Client Rating

Education

MS Computer Science
Centre for Advanced Studies in Engineering (CASE)
B.E. Computer Engineering
National University of Sciences and Technology (NUST)

Who I Am

CTO at Quickgen Technologies, with 4+ years building production AI systems. I focus on deep learning, LLM pipelines, and computer vision, backed by experience in real-time data, MLOps, and cloud deployment on AWS and Azure. I have shipped 13+ products from idea to launch, leading teams across healthcare, hospitality, fintech, and consumer tech. I also hold 16 certifications, most of them in machine learning, data science, and NLP.

Certifications

  • Neural Networks and Deep Learning — Coursera
  • AI for Medical Diagnosis — Coursera
  • Applied Data Science with Python Specialization — Coursera
  • Introduction to Data Science in Python — Coursera
  • Applied Machine Learning in Python — Coursera
  • Tools for Data Science — Coursera
  • Applied Text Mining in Python — Coursera
  • Open Source Tools for Data Science — Coursera
  • Applied Social Network Analysis in Python — Coursera
  • Data Science Orientation — Coursera
  • Applied Plotting, Charting & Data Representation in Python — Coursera
  • Deep Learning with Python — Udemy
  • Python 3.6 Complete Course — Udemy
  • Mastering Interview Skills — Udemy
  • Programming in C# — Udemy
  • Microsoft Office Specialist Word 2013 — Microsoft
Tech Stack
79 technologies across all projects

AI, ML & Deep Learning

20 tools
  • Deep Learning
  • Neural Networks
  • TensorFlow
  • Keras
  • YOLOv8
  • OpenCV
  • LLMs
  • Whisper
  • Deepgram
  • Gemini API
  • GPT-4
  • RAG
  • NLP
  • Text Mining
  • LSTM
  • scikit-learn
  • ElevenLabs
  • Sentence Transformers
  • Replicate API
  • Prompt Engineering

Data Science & Analytics

11 tools
  • Python
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Jupyter
  • Data Visualisation
  • scikit-learn
  • ChromaDB
  • Vector DBs
  • Model Fine-tuning

Backend & APIs

12 tools
  • FastAPI
  • Flask
  • Django
  • .NET Core
  • C#
  • Node.js
  • WebSockets
  • REST APIs
  • Microservices
  • gRPC
  • MQTT
  • UDP

Cloud & Infrastructure

9 tools
  • AWS
  • GCP
  • Azure
  • Vercel
  • Docker
  • Kubernetes
  • CI/CD
  • Redis
  • Nginx

Databases & BaaS

7 tools
  • PostgreSQL
  • SQL Server
  • MongoDB
  • Firebase
  • Supabase
  • ChromaDB
  • Vector DBs

Observability & DevTools

7 tools
  • Grafana
  • Sentry
  • Glitchtip
  • Prometheus
  • Docker Compose
  • GitHub Actions
  • Postman

Frontend & Mobile

6 tools
  • React
  • Next.js
  • Flutter
  • React Native
  • Tailwind
  • TypeScript

IoT & Hardware

7 tools
  • BLE 5.0
  • ESP32
  • MQTT
  • Edge AI
  • PCM Audio
  • FFmpeg
  • 200Hz+ Streaming
01
Healthcare · IoT · SaaS

upLYFT

Physical Rehabilitation Platform

View Project

Two-sided physiotherapy SaaS on Azure. BLE wearables stream 200Hz+ sensor data to a cloud ML pipeline computing joint angles, gait symmetry, power, and load in real time. Clinician, athlete, and admin dashboards with sub-100ms latency supporting 500+ concurrent sessions.

FastAPI · Flutter · BLE 5.0 · Azure · PostgreSQL · WebSockets · AI/ML · Docker
92%+ movement classification accuracy · 500+ concurrent sessions · Sub-100ms end-to-end latency
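
To illustrate two of the metrics named above, here is a minimal Python sketch of a joint angle and a gait symmetry index. It is not the upLYFT pipeline: the function names, the 2D landmark coordinates, and the symmetry formula (one common definition, where 0 means perfectly symmetric) are assumptions chosen for illustration.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle at joint b, in degrees, formed by the points a-b-c.

    Each point is a hypothetical (x, y) landmark, e.g. hip, knee, ankle.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("joint angle is undefined for coincident points")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def symmetry_index(left, right):
    """Percentage difference between left and right values of one metric."""
    mean = (left + right) / 2
    if mean == 0:
        return 0.0
    return abs(left - right) / mean * 100


knee = joint_angle_deg((0.0, 1.0), (0.0, 0.0), (1.0, 0.0))  # a right angle
asymmetry = symmetry_index(110.0, 90.0)  # e.g. left vs right step load
```

In a streaming setting, functions like these would run per sample or per stride on the server side; smoothing and sensor fusion are deliberately left out of the sketch.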
02
LLM · Real-Time · Hospitality

QuickComm

AI Hospitality Communication Engine

View Project

Real-time AI communication backbone replacing walkie-talkies in hotels. PCM audio is ingested from radios, transcribed at 94%+ accuracy via Whisper and Deepgram, classified by a Gemini LLM at ~88% intent precision, and routed to staff teams instantly. A monolith-to-microservices migration delivered 3x throughput at ~$3/month per property.

FastAPI · Gemini API · Deepgram · Whisper · Redis · WebSockets · AWS · Microservices
94%+ transcription accuracy · ~45% faster staff response · 3x throughput post-migration
03
IoT · Wearable · EdTech

AI Pendant

Smart Wearable for Kids

NDA / Private

IoT pendant wearable for children with 160+ API endpoints. Device pairing, Pomodoro timers, missions, mood tracking, sleep monitoring, baby-cry detection, rewards and badges, parental dashboard, educational content, and anti-loss alerts. Gemini-powered emotional insights and activity analytics.

FastAPI · PostgreSQL · Gemini API · Firebase · Supabase · BLE · Python
160+ API endpoints · 5+ health tracking modes · Gemini AI emotional insights
04
FinTech · IoT · Mobile

The Giving Cube

IoT Donation Ecosystem

NDA / Private

Complete IoT charity platform connecting a smart donation box, mobile app, and web dashboard. Physical coin/note donations auto-detected by the smart cube, processed via Stripe, and synced to charity dashboards in real time. BLE/WiFi dual-mode with 99.9% transaction reliability.

FastAPI · Stripe · PostgreSQL · BLE · React Native · Flask · Supabase
£50K+ donations processed · 99.9% transaction reliability · BLE/WiFi real-time sync
05
AI Vision · Generative · Product

Interior Design AI

AI-Powered Room Styling

NDA / Private

Photorealistic AI pipeline for furniture placement in room photos. Three-stage pipeline: item gallery generation → AI mood board with Gemini → composite scene rendering using depth-aware perspective scaling. Painter's algorithm compositing places furniture at accurate scale with shadow blending.

Python · Pillow · OpenCV · Replicate API · Google Gemini · NumPy
3-stage AI pipeline · Depth-aware perspective scaling · Photorealistic compositing
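
The compositing idea can be sketched in a few lines: draw the farthest item first (painter's algorithm) and shrink each item in proportion to its distance. This is a minimal sketch under stated assumptions, not the production pipeline: the `Item` fields, the reference depth, and the simple pinhole rule (apparent size proportional to 1/depth) are all hypothetical, and shadow blending is left out.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    depth_m: float      # assumed distance from the camera, in metres
    base_width_px: int  # sprite width when the item sits at the reference depth


def plan_composite(items, reference_depth_m=2.0):
    """Return (name, scaled_width_px) pairs in back-to-front draw order."""
    for item in items:
        if item.depth_m <= 0:
            raise ValueError(f"{item.name}: depth must be positive")
    # Painter's algorithm: farthest items are drawn first, so nearer items
    # painted later correctly cover them.
    ordered = sorted(items, key=lambda it: it.depth_m, reverse=True)
    plan = []
    for item in ordered:
        scale = reference_depth_m / item.depth_m  # pinhole model: size ~ 1/depth
        plan.append((item.name, round(item.base_width_px * scale)))
    return plan


draw_plan = plan_composite([
    Item("rug", 1.0, 300),
    Item("sofa", 4.0, 400),
    Item("lamp", 2.0, 100),
])
```

The resulting plan could then be handed to an image library such as Pillow to resize and paste each sprite in order.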
06
Computer Vision · AI · Security

CCTV Anomaly AI

Real-Time Surveillance Intelligence

NDA / Private

Multi-phase AI surveillance system evolved from prototype to production MVP. YOLOv8 detects crowd surges, loitering, intrusion, and fights across 8+ simultaneous camera feeds. WebSocket streams push annotated alerts to operator dashboards with configurable per-zone thresholds.

YOLOv8 · OpenCV · FastAPI · Python · PostgreSQL · WebSockets · Docker · FFmpeg
91% detection accuracy · 8+ simultaneous camera feeds · 35% fewer false positives
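
Per-zone thresholds are easiest to picture with loitering. The sketch below assumes a detector and tracker have already produced timestamped positions for one person; it then checks how long that person stays inside each zone polygon. The data shapes, the `max_dwell_s` key, and the function names are hypothetical, not the deployed system's API.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def loitering_alerts(track, zones):
    """Return names of zones where continuous dwell time reaches the limit.

    track: list of (timestamp_s, x, y) samples for one tracked person.
    zones: dict of name -> {"polygon": [(x, y), ...], "max_dwell_s": float}.
    """
    alerts = []
    for name, zone in zones.items():
        entered_at = None
        for t, x, y in track:
            if point_in_polygon(x, y, zone["polygon"]):
                if entered_at is None:
                    entered_at = t
                if t - entered_at >= zone["max_dwell_s"]:
                    alerts.append(name)
                    break
            else:
                entered_at = None  # leaving the zone resets the timer
    return alerts


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
zones = {
    "entrance": {"polygon": square, "max_dwell_s": 5},
    "lobby": {"polygon": square, "max_dwell_s": 30},
}
track = [(0, 5, 5), (3, 5, 6), (6, 6, 6)]
alerts = loitering_alerts(track, zones)
```

Keeping the threshold in the zone config rather than in code is what lets operators tune each area separately.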
07
Mobile · GPS · Sports Tech

Apollo Golf GPS

Smart Caddie Companion App

NDA / Private

GPS-based golf companion app replacing rangefinders for 1,000+ active users. Backend handles ball tracking, scorecard management, and shot analytics. Optimised PostgreSQL schema delivers course data in under 80ms. Integrated real-time GPS with ±2m accuracy for precise on-course navigation.

Flutter · FastAPI · Google Maps API · PostgreSQL · Supabase · Python
1,000+ active users · <80ms course data queries · ±2m GPS accuracy
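
The core of any GPS rangefinder is the distance between two coordinates. Here is a minimal sketch using the haversine formula; the function names and the spherical-Earth radius are assumptions for illustration, not the app's actual code.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def metres_to_yards(metres):
    """Golfers usually read distances in yards (1 yd = 0.9144 m)."""
    return metres / 0.9144


# Hypothetical player and pin positions, roughly 150 m apart.
to_pin_m = haversine_m(33.70000, 73.05000, 33.70135, 73.05000)
to_pin_yd = metres_to_yards(to_pin_m)
```

At golf-course distances the spherical approximation error is far smaller than the ±2m receiver accuracy quoted above, so a more precise ellipsoidal model would add little.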
08
Enterprise · Digital Twin · SaaS

MRO Digital Twin

Automotive Workshop Management

NDA / Private

Full-stack digital twin for automotive MRO workshops. Seven role-based dashboards (manager, reception, technician, store, QC, QA, supervisor) covering the complete vehicle lifecycle — intake → work order → stage checklists → spare GRN → QC → delivery. 34+ database tables, barcode/QR integration, real-time KPI dashboards.

FastAPI · React · PostgreSQL · SQLAlchemy · Docker · Tailwind · QR Codes
7 role-based dashboards · 34+ DB tables · Full vehicle lifecycle tracking
09
AI · LLM · Research

PaperIntel

RAG Research Intelligence

NDA / Private

End-to-end Retrieval-Augmented Generation system for research papers. PDF ingestion, chunking, BGE embedding, ChromaDB vector store, cross-encoder reranking, and citation-aware generation. Supports similarity, MMR, and hybrid BM25+vector retrieval with query expansion and multi-hop decomposition.

FastAPI · ChromaDB · Sentence Transformers · PyMuPDF · Next.js · BM25 · OpenAI
Hybrid BM25 + vector retrieval · Citation-aware generation · Multi-hop query decomposition
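
Of the retrieval modes listed, MMR (maximal marginal relevance) is the simplest to show in isolation. The sketch below is a plain-Python illustration, not PaperIntel's code: the toy 2D vectors, the function names, and the `lambda_` default are assumptions. Each step picks the candidate that is most relevant to the query while being least similar to what has already been selected.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr(query_vec, doc_vecs, k=3, lambda_=0.7):
    """Return the indices of up to k documents chosen by MMR.

    lambda_ = 1.0 ranks purely by relevance; lower values favour diversity.
    """
    candidates = list(range(len(doc_vecs)))
    selected = []

    def score(i):
        relevance = cosine(query_vec, doc_vecs[i])
        # Redundancy is the similarity to the closest already-selected document.
        redundancy = max(
            (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
        )
        return lambda_ * relevance - (1 - lambda_) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 0.5]
docs = [[1.0, 0.4], [1.0, 0.6], [0.0, 1.0]]  # docs 0 and 1 are near-duplicates
by_relevance = mmr(query, docs, k=2, lambda_=1.0)
with_diversity = mmr(query, docs, k=2, lambda_=0.3)
```

With `lambda_=1.0` the two near-duplicate chunks are returned together; lowering it swaps the second one for the dissimilar chunk, which is the behaviour that keeps a context window from filling up with repeats.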
10
LLM · Real-Time · Audio

Real-Time Translation

Multilingual Audio AI Platform

NDA / Private

Real-time translation backend capturing live PCM audio, transcribing and translating into 10+ languages in under 800ms end-to-end. Achieved 96%+ transcription accuracy using Whisper with custom language model fine-tuning for domain-specific vocabulary.

FastAPI · Whisper · Python · WebSockets · Redis · Docker · AWS
<800ms end-to-end latency · 96%+ transcription accuracy · 10+ languages supported
11
IoT · AI · Sports Tech

Smart Boxing Gloves

AI Combat Sports Analytics

NDA / Private

Real-time AI system embedded in boxing gloves tracking speed, force, calorie burn, and motion patterns. Hardware sensors stream data to cloud at 200Hz+; ML models analyse punch mechanics and training load. Data-to-insight latency reduced to under 50ms with hardware-cloud integration.

Python · FastAPI · WebSockets · ESP32 · BLE · MQTT · React · PostgreSQL
<50ms data-to-insight latency · 200Hz+ sensor streaming · Speed, force, and calorie analytics
12
LLM · Voice · AI Assistant

BadarAI Assistant

Conversational Voice AI

NDA / Private

Full-duplex voice AI assistant with streaming STT, LLM reasoning, and TTS synthesis. Deepgram streams speech-to-text in real time; Gemini handles multi-turn dialogue with persistent memory; ElevenLabs renders natural-voice responses under 800ms. Supports persona customisation and live tool-calling for real-time data fetching.

FastAPI · Gemini API · Deepgram · ElevenLabs · WebSockets · Redis · React · Python
<800ms end-to-end latency · Multi-turn memory · Live tool-calling support
13
NLP · AI · Fraud Detection

Fake Ad Detection

NLP Fraud Classification System

NDA / Private

NLP fraud identification system classifying 50,000+ online advertisements. Ensemble approach combining LSTM, SVM, Doc2Vec, and TF-IDF features achieves 94% classification accuracy with 40% faster inference compared to baseline models. Deployed as a FastAPI microservice with real-time scoring.

Python · LSTM · SVM · Doc2Vec · TF-IDF · scikit-learn · FastAPI
94% classification accuracy · 50,000+ ads classified · 40% faster inference vs baseline
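
One simple way to combine several classifiers is weighted soft voting: average the fraud probabilities from each model and threshold the result. The description above does not say how the ensemble members are actually combined, so treat this as an illustrative assumption; the model names, weights, and threshold below are hypothetical.

```python
def ensemble_score(probabilities, weights=None):
    """Weighted average of per-model fraud probabilities.

    probabilities: dict of model name -> probability in [0, 1].
    weights: optional dict of model name -> non-negative weight.
    """
    if not probabilities:
        raise ValueError("at least one model probability is required")
    if weights is None:
        weights = {name: 1.0 for name in probabilities}
    total = sum(weights[name] for name in probabilities)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * weights[name] for name, p in probabilities.items()) / total


def classify(probabilities, weights=None, threshold=0.5):
    """Label an ad as 'fraud' or 'legitimate' from the ensemble score."""
    score = ensemble_score(probabilities, weights)
    return "fraud" if score >= threshold else "legitimate"


# Hypothetical outputs of two ensemble members for a single advertisement.
label = classify({"lstm": 0.9, "svm": 0.7})
```

In a scoring service, a function like `classify` would sit behind the API endpoint, with the per-model probabilities computed upstream.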
Available for project-based consultations
AI systems, computer vision, LLMs, production architecture
Book a Call
Experience
6 positions · 🌍 4 countries
01
🇦🇪 Dubai, UAE
Present · Full-time
QuickComm AE

Chief Technology Officer

  • Architected a real-time audio communication system replacing walkie-talkies in hotels, cutting staff response time by ~45%.
  • Built a PCM audio ingestion pipeline with Whisper & Deepgram, achieving 94%+ transcription accuracy and ~88% intent precision via a Gemini LLM.
FastAPI · Gemini API · Deepgram · Whisper · AWS
02
🇬🇧 London, UK
Present · Contract
upLYFT

IoT & Full Stack Developer

  • Deployed complete backend & web-app for a two-sided physiotherapy platform, reducing clinician onboarding time by ~60%.
  • Integrated ML-based kinematics pipeline — gait analysis and kinetics models achieving 92%+ movement classification accuracy from IoT wearable sensors.
FastAPI · Flutter · BLE 5.0 · Azure · SQL Server
03
🇦🇺 🇦🇪 🇵🇰 AU · UAE · PK
Present · Full-time
Quickgen Technologies

Chief Technology Officer

  • Leading technical strategy and end-to-end delivery across AI, IoT, and SaaS products in healthcare, hospitality, fintech, and consumer tech.
  • Shipped CCTV anomaly detection system (YOLOv8) with 91% accuracy across 8+ simultaneous camera feeds, reducing false alerts by 35%.
Python · FastAPI · YOLOv8 · OpenCV · AWS
04
🌍 International Remote
Freelance

Freelance (Upwork & Direct)

AI Engineer & Backend Developer

  • Delivered 15+ AI and backend solutions for international clients — computer vision (OCR, pose estimation, object detection), NLP automation pipelines, and full-stack web apps.
Python · FastAPI · React · Docker
05
🇵🇰 Islamabad, PK
Internship

CareCloud

Information Technology Intern

  • Built health services REST APIs in .NET Core C# for a live healthcare production system.
.NET Core · C# · SQL Server · REST APIs
06
🇵🇰 Islamabad, PK
Internship

PTCL

Software Engineer Intern

  • Developed an Employee Record Search desktop application for HR using Python and deep learning.
Python · Deep Learning · Desktop App

Writing

Latest Posts

01
How to Deploy a Computer Vision Model to Production
Computer Vision

Most CV tutorials end at model training. This guide covers every layer I put in place before any vision model goes live — API design, containerisation, versioning, monitoring, and cost optimisation.

2025
02
Building Real-Time IoT Systems with BLE and WebSockets: Lessons from 200Hz+ Sensor Streaming
IoT

The hardest part of building wearable tech isn't the AI. It's the 200 milliseconds between the sensor and the screen. Four years of lessons from production IoT systems — BLE reconnection, protocol selection, edge preprocessing, and monitoring.

2025
03
LLM-Powered Real-Time Audio Pipelines: How We Built AI Transcription at Scale
Real-Time Audio

Most developers think the hard part of voice AI is the speech-to-text model. It isn't. The hard part is everything around it — the audio ingestion pipeline, the LLM classification layer, the WebSocket architecture, and the operational infrastructure that keeps it all running under production load.

2025
04
My Production Deployment Checklist for AI Systems: What I Check Before Every Launch
Production AI

Every item on this checklist exists because I once shipped without it. Seven layers — crash reporting, analytics, UX feedback, bug tracking, infrastructure monitoring, device fingerprinting, and CDN — that I now run before any AI system goes live.

2025
05
RAG Architecture in Production: Building a Research Intelligence System with ChromaDB and BM25
RAG

Production RAG fails in specific ways the tutorials skip. I built PaperIntel — a research intelligence system with citation-level accuracy — using hybrid retrieval, cross-encoder reranking, and systematic evaluation. This is what the full architecture actually looks like.

2025
06
Model Quantization for Production: How I Cut Inference Cost by 60% Without Touching Accuracy
Model Optimization

Your production AI model is probably 4x bigger than it needs to be. I reduced inference time from 340ms to 91ms and cut monthly cloud costs by 60% using INT8 quantization — without changing a single model layer. Here's the full pipeline.

2025
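
For readers new to the idea, INT8 quantization stores each FP32 weight (4 bytes) as a single signed byte plus a shared scale, which is where the 4x size figure comes from; the quoted 340ms to 91ms change is roughly a 3.7x speedup. The sketch below shows symmetric per-tensor quantization in plain Python. It is an illustration of the arithmetic only, with hypothetical function names, and not the pipeline described in the article.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: returns (int8 values, scale)."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Whether accuracy survives depends on how the weights and activations are distributed, which is why production pipelines add a calibration step on representative data before committing to the quantized model.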
07
Monolith to Microservices: How We Achieved 3x Throughput on a Live Production System
System Architecture

Most microservices migrations are driven by architectural fashion rather than specific engineering pain. Ours was driven by a measurable scaling problem. This is the story of migrating a live platform without downtime, what broke in ways we didn't anticipate, and what 3x throughput actually looks like.

2025
08
YOLOv8 in Production: Building a Multi-Camera CCTV Anomaly Detection System
Computer Vision

YOLOv8 benchmarks are well documented. What's not documented is what happens when you process 8 simultaneous CCTV feeds in real time, apply zone-based business rules, and deliver WebSocket alerts under 200ms while keeping false positives low enough that security staff actually trust the system.

2025
Available for new projects

CTO at Quickgen Technologies & QuickComm AE. 4+ years building production-grade AI systems, scalable backend architectures, and IoT-integrated platforms — from real-time LLM pipelines to computer vision and cloud-native deployment on AWS and Azure.

Let's work together

Got a project in mind?

© 2026 Qalab Hassnain Agha

All rights reserved · qalabagha.com

Built with Next.js · Three.js · GSAP