🌐 Live demo (LOOK · UNDERSTAND · BUILD): https://dev48v.infy.uk/solve/day2-pothole-detector.html
Day 2 of SolveFromZero — pick a real hackathon problem, ship a working solution. Today's brief is from Smart India Hackathon: Indian roads have ~9 million potholes annually. Municipal complaints rely on citizens manually photographing each one. Build a system that watches dashcam video, detects potholes, GPS-tags them, and posts to the civic complaint API.
This is also the template for any "frame-by-frame civic detection" — garbage dumps, illegal parking, missing signage, encroachment. Same pipeline, swap the model.
The pipeline in 6 stages
📹 Video stream (30 FPS)
↓
🎯 YOLOv8 per-frame object detection
↓
✅ Confidence threshold (filter false positives)
↓
📍 GPS pairing (lat/lng from parallel GPX track)
↓
🔗 Cluster duplicates (same pothole in 30 frames → 1 complaint)
↓
📤 POST to civic complaint API
Each stage is ~20 lines of Python. Total: ~150 lines for the whole thing.
Stage 1+2 — YOLO frame detection
YOLOv8 = state-of-the-art object detector. The n (nano) variant is 14 MB and runs ~30 FPS on a CPU laptop. Roboflow Universe hosts pre-trained pothole-detection weights — no labeling required.
from ultralytics import YOLO
import cv2

model = YOLO("pothole-yolov8n.pt")  # 14 MB pre-trained
cap = cv2.VideoCapture("dashcam.mp4")

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, conf=0.6)  # confidence threshold
    for box in results[0].boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()  # box corners in pixels
        conf = float(box.conf)
        cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)

cap.release()  # free the file / stream handle when the video ends
OpenCV reads MP4 files, RTSP streams, or webcams interchangeably. Same code, different source.
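A minimal sketch of how one entry point could accept any of those sources. `cv2.VideoCapture` takes an integer for a local camera index and a string for a file path or stream URL; the helper name `resolve_source` and the RTSP URL below are illustrative assumptions, not part of the original pipeline.

```python
def resolve_source(src: str):
    """Turn a CLI-style string into the argument cv2.VideoCapture expects.

    Digits-only strings (e.g. "0") become an int camera index; anything
    else (a file path or an rtsp:// URL) is passed through as a string.
    """
    src = src.strip()
    return int(src) if src.isdecimal() else src


# Usage (needs opencv-python installed; shown as comments only):
#   import cv2
#   cap = cv2.VideoCapture(resolve_source("0"))                   # webcam
#   cap = cv2.VideoCapture(resolve_source("dashcam.mp4"))         # file
#   cap = cv2.VideoCapture(resolve_source("rtsp://host/stream"))  # hypothetical stream URL
```

The rest of the detection loop stays the same whichever source is opened.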
Stage 3 — Confidence tuning is the killer detail
Each detection has a confidence score 0-1. The threshold dictates your tradeoff:
| Threshold | Recall | Precision | What happens |
|---|---|---|---|
| 0.3 | 100% | 60% | Reports every shadow / manhole / dark stain |
| 0.6 | ~90% | ~85% | Sweet spot for road damage |
| 0.9 | 60% | 99% | High precision but misses smaller potholes |
Production deployments typically run at 0.55-0.7 for road-damage detection. Below that = too many citizen complaints filed for shadows. Above = miss the actual problem you're trying to solve.
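To pick a threshold for your own footage rather than trusting a table, you can label a short clip by hand and sweep the threshold. The sketch below assumes each detection has been marked as a real pothole or not, and that each real pothole is detected at most once. The sample scores are invented for illustration, not measured.

```python
def precision_recall(scored, total_real, threshold):
    """scored: list of (confidence, is_real_pothole), one entry per detection.
    total_real: number of real potholes in the labelled clip."""
    kept = [is_real for conf, is_real in scored if conf >= threshold]
    true_pos = sum(kept)
    precision = true_pos / len(kept) if kept else 1.0
    recall = true_pos / total_real if total_real else 0.0
    return precision, recall


# Invented sample: 4 real potholes plus lower-scoring shadows / manholes.
sample = [(0.95, True), (0.80, True), (0.65, True), (0.40, True),
          (0.70, False), (0.45, False), (0.35, False), (0.32, False)]

for t in (0.3, 0.6, 0.9):
    p, r = precision_recall(sample, total_real=4, threshold=t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

Raising the threshold can only shrink the kept set, so recall never goes up as the threshold rises; precision usually improves but is not guaranteed to.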
Stage 4 — GPS pairing
A pothole at lat/lng is 100× more useful than "somewhere in this video." Dashcams usually record GPS either:
- Embedded in video metadata (GoPro GPMF telemetry, Garmin VIRB)
- Parallel GPX file (most dedicated dashcams)
- Separate phone log (if you used a phone-based recorder)
Match each video frame's timestamp to the closest GPS sample:
import gpxpy

with open("track.gpx") as f:
    gpx = gpxpy.parse(f)

points = [(p.time, p.latitude, p.longitude)
          for trk in gpx.tracks
          for seg in trk.segments
          for p in seg.points]

def gps_at(frame_time):
    # Binary search by time, return (lat, lng).
    # interpolate() is a helper you supply; it is not part of gpxpy.
    return interpolate(points, frame_time)
Binary search by time, interpolate between the two closest GPS fixes.
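The `interpolate` helper called by `gps_at` is left undefined above. Here is one possible version, assuming `points` is sorted by time and every fix carries a timestamp. `frame_timestamp` is an extra hypothetical helper that assumes a constant frame rate and a known recording start time.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone


def interpolate(points, frame_time):
    """Linearly interpolate (lat, lng) at frame_time.

    points: list of (time, lat, lng) tuples sorted by time.
    Times outside the track are clamped to the first / last fix.
    """
    times = [t for t, _, _ in points]   # rebuilt per call; cache it for long tracks
    i = bisect_left(times, frame_time)  # binary search for the first fix >= frame_time
    if i == 0:
        return points[0][1], points[0][2]
    if i == len(points):
        return points[-1][1], points[-1][2]
    (t0, lat0, lng0), (t1, lat1, lng1) = points[i - 1], points[i]
    w = (frame_time - t0) / (t1 - t0)   # 0.0 at t0, 1.0 at t1
    return lat0 + w * (lat1 - lat0), lng0 + w * (lng1 - lng0)


def frame_timestamp(video_start, frame_index, fps=30.0):
    """Wall-clock time of a frame, assuming a constant frame rate."""
    return video_start + timedelta(seconds=frame_index / fps)


# Two fixes one second apart; frame 15 at 30 FPS falls halfway between them.
start = datetime(2024, 1, 1, 10, 0, 0, tzinfo=timezone.utc)  # assumed recording start
track = [(start, 19.0000, 72.0000),
         (start + timedelta(seconds=1), 19.0010, 72.0020)]
lat, lng = interpolate(track, frame_timestamp(start, frame_index=15))
```

Straight-line interpolation in lat/lng is an approximation, but over the second or so between consecutive fixes the error is far smaller than the GPS error itself. Keep both clocks in the same timezone, or the frame and fix timestamps will not line up.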
Stage 5 — Cluster duplicate detections
You drive past one pothole → it appears in 30 consecutive frames. Without clustering, you'd file 30 complaints for the same hole. Group detections within ~10 m of each other into one complaint, keeping the highest-confidence frame as the evidence.
Cheap approximation: round lat/lng to 4 decimal places (~11 m at the equator). Group by the rounded tuple.
from itertools import groupby

def key(d):
    return (round(d.lat, 4), round(d.lng, 4))

clusters = []
for k, grp in groupby(sorted(detections, key=key), key=key):
    grp_list = list(grp)
    best = max(grp_list, key=lambda d: d.conf)
    clusters.append({
        "lat": best.lat, "lng": best.lng,
        "conf": best.conf, "frame": best.frame
    })
Production should use a proper spatial index (geohashes, or Uber's H3 hexagonal grid) for scale. For a hackathon demo, rounding is fine.
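One catch with rounding: a pothole that sits on a cell boundary can land in two cells and be filed twice. A distance-based pass avoids that. The sketch below is a simple greedy variant using the haversine formula. It assumes detections are dicts rather than the attribute-style objects used above, and the function names are illustrative.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two lat/lng points."""
    dlat = radians(lat2 - lat1)
    dlng = radians(lng2 - lng1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def cluster_by_distance(detections, radius_m=10.0):
    """Greedy clustering: visit detections from highest to lowest confidence;
    keep one only if it is farther than radius_m from every detection kept so far.

    detections: iterable of dicts with "lat", "lng", "conf", "frame".
    Returns the kept detections, i.e. one representative per cluster.
    """
    reps = []
    for d in sorted(detections, key=lambda d: d["conf"], reverse=True):
        if all(haversine_m(d["lat"], d["lng"], r["lat"], r["lng"]) > radius_m
               for r in reps):
            reps.append(d)
    return reps


# The first two points are ~2 m apart but round to different 4-decimal cells.
dets = [
    {"lat": 19.00004, "lng": 72.0, "conf": 0.7, "frame": 10},
    {"lat": 19.00006, "lng": 72.0, "conf": 0.9, "frame": 12},
    {"lat": 19.00100, "lng": 72.0, "conf": 0.8, "frame": 90},
]
reps = cluster_by_distance(dets)
```

This compares every detection against every kept representative, which is fine for a single drive but is the part a spatial index would replace at city scale.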
Stage 6 — Post to civic API
Most Indian municipal corporations rolled out JSON complaint APIs in 2023-2024: MCGM (Mumbai), BBMP (Bengaluru), PMC (Pune), GHMC (Hyderabad). Field names vary between them, but the payload shape is broadly the same:
import requests

for cluster in clusters:
    # s3_upload() is your own helper: upload the frame, return its public URL
    requests.post("https://api.mcgm.gov.in/v1/complaints", json={
        "category": "pothole",
        "lat": cluster["lat"],
        "lng": cluster["lng"],
        "confidence": cluster["conf"],
        "image_url": s3_upload(cluster["frame"]),
        "reporter": "dashcam-auto-v1"
    }, timeout=10)  # don't hang the whole batch on one slow request
The municipality dashboard already exists. Your job is the auto-filing pipeline.
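Before pointing the loop at a live endpoint, it helps to be able to dry-run it. The sketch below separates building the payload from sending it. `build_payload` and `file_complaints` are illustrative names, the field names simply mirror the request above, and `upload` / `send` are callables you inject (for example, a wrapper around `requests.post` that returns `response.ok`).

```python
def build_payload(cluster, image_url, reporter="dashcam-auto-v1"):
    """Assemble the complaint body, rejecting impossible coordinates."""
    lat, lng = float(cluster["lat"]), float(cluster["lng"])
    if not (-90 <= lat <= 90 and -180 <= lng <= 180):
        raise ValueError(f"coordinates out of range: {lat}, {lng}")
    return {
        "category": "pothole",
        "lat": round(lat, 6),            # 6 decimals is roughly 0.1 m
        "lng": round(lng, 6),
        "confidence": round(float(cluster["conf"]), 3),
        "image_url": image_url,
        "reporter": reporter,
    }


def file_complaints(clusters, upload, send):
    """upload(frame) -> URL and send(payload) -> bool are injected, so the
    loop can run without touching storage or a live endpoint."""
    filed, failed = [], []
    for cluster in clusters:
        payload = build_payload(cluster, upload(cluster["frame"]))
        (filed if send(payload) else failed).append(payload)
    return filed, failed


# Dry run with stand-ins for the uploader and the HTTP call.
demo = [{"lat": 19.076, "lng": 72.8777, "conf": 0.91234, "frame": "f1.jpg"}]
filed, failed = file_complaints(
    demo,
    upload=lambda frame: "https://example.invalid/" + frame,
    send=lambda payload: True,
)
```

Returning the failed payloads separately makes it easy to retry them later instead of silently dropping complaints.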
Try it now
The 3-tier learning page has a simulated dashcam + live YOLO bounding-box overlay + a feed of complaints filed in real time:
https://dev48v.infy.uk/solve/day2-pothole-detector.html
- LOOK — start the simulated drive, drag the confidence threshold, watch detections appear or get rejected
- UNDERSTAND — 9 click-through steps explaining YOLO, OpenCV, threshold tuning, GPS pairing, clustering
- BUILD — full Python pipeline with real pip install commands
What this unlocks
Same skeleton works for any frame-by-frame civic detection:
| Problem | Model swap |
|---|---|
| Pothole detection | pothole-yolov8n.pt |
| Illegal parking | Vehicle detection + zone polygon |
| Garbage dump tracking | litter-yolov8n.pt (Roboflow has it) |
| Red-light running | Traffic-light detection + vehicle motion |
| Sidewalk encroachment | Obstruction detection + sidewalk geofence |
The video-loop + detect + threshold + GPS + cluster + POST shape is invariant.
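One way to make that invariance concrete is to pass the model in as a plain function. The sketch below is an assumption about how you might structure it, not the code from the BUILD tab: `detect` is whatever wraps your model and returns confidences for a frame, `locate` maps a frame index to a GPS fix, and the POST stage is left to the caller.

```python
from typing import Callable, Iterable, NamedTuple


class Detection(NamedTuple):
    frame_index: int
    conf: float
    lat: float
    lng: float


def run_pipeline(frames: Iterable,
                 detect: Callable,   # frame -> iterable of confidences (the swappable model)
                 locate: Callable,   # frame_index -> (lat, lng)
                 threshold: float = 0.6):
    """video loop -> detect -> threshold -> GPS -> cluster.

    Returns the highest-confidence Detection per rounded lat/lng cell.
    """
    best = {}
    for i, frame in enumerate(frames):
        for conf in detect(frame):
            if conf < threshold:
                continue                                # reject low-confidence hits
            lat, lng = locate(i)
            key = (round(lat, 4), round(lng, 4))        # same rounding trick as Stage 5
            if key not in best or conf > best[key].conf:
                best[key] = Detection(i, conf, lat, lng)
    return list(best.values())


# Stand-in model and GPS track, just to exercise the loop.
fake_scores = {"f0": [0.7], "f1": [0.9, 0.4], "f2": []}
out = run_pipeline(["f0", "f1", "f2"],
                   detect=lambda frame: fake_scores[frame],
                   locate=lambda i: (19.0, 72.0))
```

Swapping potholes for litter or illegally parked vehicles then means changing only the `detect` callable (and, for zone-based problems, adding a filter after `locate`).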
What's next in SolveFromZero
Day 3: AI crop disease identifier (Smart India Hackathon). Leaf photo → disease classification + remedy suggestion. Edge AI in the village ag-extension officer's pocket.
🌐 All problems: https://dev48v.infy.uk/solvefromzero.php