Day 35/60 System Design Questions

#systemdesign #distributedsystems #backend #database

This article was written with the assistance of AI tooling for structure and syntax. The concepts, tradeoffs, and production context are based on my own engineering experience and research. #ABotWroteThis

You're building the "find nearby drivers" feature for a ride-hailing app.

At peak, you have 500,000 active drivers updating their GPS location every 5 seconds. Riders query for drivers within 2km. At scale, you're doing ~100,000 proximity queries per second.

Your naive implementation does this:

SELECT * FROM drivers
WHERE lat BETWEEN ? AND ?
AND lng BETWEEN ? AND ?

It works fine at 1,000 drivers. At 500,000, it's a full table scan on every query. Latency hits 800ms. Riders see a spinner. Drivers miss trips.

Your team proposes 4 approaches to fix this:

A) Geohash partitioning — Encode each driver's location into a geohash string. Index by geohash prefix. Proximity queries become a string lookup on the index.

B) PostGIS with spatial indexes — Add a PostGIS extension to Postgres. Use a proper R-tree/GiST spatial index for bounding-box and radius queries.

C) Quadtree in memory — Keep all active driver positions in a quadtree data structure in a Redis-backed in-memory service. Decompose space recursively until each cell has ≤ N drivers.

D) H3 hexagonal grid (Uber's system) — Divide the earth into hexagonal cells at multiple resolutions. Assign each driver to a cell. Queries check the target cell + 6 neighbors at the right resolution.

You need sub-50ms p99 latency, real-time updates, and it has to stay accurate at cell boundaries.

Pick one — A, B, C, or D — and tell me why. Full breakdown in the comments.

If your team has argued about spatial indexing before, share this. Worth the debate.

Drop your answer 👇

30DaysOfSystemDesign #SystemDesign #SoftwareArchitecture #DistributedSystems

Top comments (4)

Joud Awad • Jun 10

Answer: D — H3 Hexagonal Grid ✅

Why D wins:
Uber open-sourced H3 for exactly this problem. Hexagons have 6 neighbors, all equidistant from the center. Squares (geohash) have corner neighbors that are ~40% farther — this distorts "within 2km" queries. H3 also has 16 resolution levels (no reindexing on zoom changes). The killer feature: the k-ring query (target cell + 6 neighbors) guarantees no driver gets missed near a boundary. Uber does sub-10ms lookups at millions of drivers: get cells in ring → fetch IDs from Redis set keyed by cell. Lyft, Airbnb, DoorDash all use variants.

Joud Awad • Jun 10

Why A is the trap (Geohash):
Production-grade, widely used — and wrong here. Geohash uses a Z-order curve (interleaved bits) that doesn't guarantee geographic adjacency for adjacent prefix strings. Two drivers 1 meter apart can have entirely different prefix strings if they straddle a cell boundary. This becomes a systematic miss rate at 100k queries/sec. Geohash is right for coarse partitioning (sharding by region), wrong for real-time sub-km proximity.

Joud Awad • Jun 10

Why B fails at scale (PostGIS):
PostGIS is correct — it's the right answer for 5,000 trucks on a fleet dashboard. Not for 500k drivers updating every 5s. The math: 500k × 12 updates/min = 100k writes/sec against a GiST index. GiST indexes are write-expensive. Your DB burns half its capacity on index maintenance. And you can't easily scale Postgres writes horizontally. PostGIS is the starting point — H3 + Redis is where you land when it breaks.

Joud Awad • Jun 10

Why C almost works (Quadtree):
O(log n) insert/query, great in theory. Breaks at ride-hailing density: driver crossing a cell boundary = delete + reinsert + potential rebalancing, 100k times per second. Same rectangular boundary problem as geohash. Dense cities create deep uneven trees. A quadtree is a building block — not a complete solution at this scale.