DEV Community

umesh kushwaha

How Uber Eats & Zomato Find “Restaurants Near You”

Search & Discovery Architecture for Location-Based Systems (Deep Dive)

Finding nearby restaurants looks simple. At scale, it’s a hard distributed systems problem involving spatial indexing, ranking, freshness, and tradeoffs between accuracy and performance.

This blog deep-dives into Search & Discovery for location-based systems like:

  • Restaurant discovery (Zomato, Swiggy)
  • Store search (Blinkit, Instamart)
  • Nearby places (Google Maps-lite use cases)

1️⃣ Problem Definition

Inputs

  • User location (latitude, longitude)
  • Optional text query: “pizza”, “burger”, “south indian”
  • Filters: open now, rating, price, cuisine
  • Radius: 1–10 km

Output

  • Ranked list of nearby restaurants
  • Sorted by relevance, distance, and business signals

Constraints

  • Millions of restaurants
  • Low latency (< 200ms P95)
  • High read QPS
  • Data changes, but not every second

2️⃣ Nature of the Data

Understanding data behavior drives architectural decisions.

| Attribute | Behavior |
| --- | --- |
| Location | Mostly static |
| Name / Cuisine | Rarely changes |
| Ratings | Periodic updates |
| Open / Close status | Time-based |
| Query pattern | Read-heavy |

➡️ This is not a real-time problem
➡️ This is a search and indexing problem


3️⃣ High-Level Architecture


User Request
     ↓
Search API
     ↓
Elasticsearch (Geo + Text)
     ↓
Source DB (Postgres / MySQL)

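An optional Redis cache can sit in front of Elasticsearch for hot queries. The source database remains the system of record; Elasticsearch serves reads from an index built from it.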


4️⃣ Why Elasticsearch?

Elasticsearch combines three critical capabilities:

  • Geo-spatial indexing
  • Full-text search
  • Relevance scoring and sorting

Doing all three efficiently in a relational database becomes painful at scale.


5️⃣ Location Modeling

Geo Point Mapping

{
  "location": {
    "type": "geo_point"
  }
}

This allows Elasticsearch to index latitude and longitude efficiently.
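A fuller mapping might look like the sketch below, written as a Python dict so it can be handed to an index-creation call. Only the `location` field comes from this post; `name`, `cuisine`, and `rating` are assumptions inferred from the queries used later, and `make_doc` is a hypothetical helper.

```python
# Hypothetical index mapping; only "location" is taken from the post itself.
RESTAURANT_MAPPING = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},
            "cuisine": {"type": "text"},
            "rating": {"type": "float"},
            "location": {"type": "geo_point"},
        }
    }
}


def make_doc(name, cuisine, rating, lat, lon):
    """Build a document whose shape matches the mapping above."""
    return {
        "name": name,
        "cuisine": cuisine,
        "rating": rating,
        # geo_point accepts an object with "lat" and "lon" keys.
        "location": {"lat": lat, "lon": lon},
    }
```

Keeping the document builder next to the mapping makes it harder for the two to drift apart.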


6️⃣ Geo Distance Query

{
  "query": {
    "bool": {
      "filter": {
        "geo_distance": {
          "distance": "5km",
          "location": {
            "lat": 12.9716,
            "lon": 77.5946
          }
        }
      }
    }
  }
}

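Because the geo clause runs in filter context, it does not affect scoring and its result can be cached. The spatial index lets Elasticsearch skip documents far from the point instead of scanning the whole index.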


7️⃣ How Elasticsearch Computes “Nearby”

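Elasticsearch does its spatial preprocessing at index time, using:

  • BKD trees (the Lucene structure behind geo_point fields in current versions)
  • Geohash-based partitioning (used by older versions, and still available for grid aggregations)

Restaurants are indexed into spatial cells. Queries only scan nearby cells.

Exact distance is computed at query time, and only for the candidates that survive the spatial filter.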
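The "cells first, exact distance second" idea can be illustrated with a toy grid index. This is not how Elasticsearch implements BKD trees; it is a simplified sketch, and the cell size, function names, and record shape are all assumptions made for the example. It also ignores the poles and the antimeridian.

```python
import math
from collections import defaultdict

CELL_DEG = 0.05  # assumed cell size in degrees (about 5.5 km of latitude)


def cell_of(lat, lon):
    """Map a coordinate to its grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def build_index(restaurants):
    """Preprocessing step: bucket every restaurant into a cell."""
    index = defaultdict(list)
    for r in restaurants:
        index[cell_of(r["lat"], r["lon"])].append(r)
    return index


def nearby(index, lat, lon, radius_km):
    """Query step: scan only cells overlapping the bounding box, then filter by exact distance."""
    lat_span = radius_km / 111.0
    # Use the latitude farthest from the equator so the box is not too narrow.
    edge_lat = min(abs(lat) + lat_span, 89.0)
    lon_span = radius_km / (111.0 * math.cos(math.radians(edge_lat)))
    lo = cell_of(lat - lat_span, lon - lon_span)
    hi = cell_of(lat + lat_span, lon + lon_span)
    hits = []
    for cx in range(lo[0], hi[0] + 1):
        for cy in range(lo[1], hi[1] + 1):
            for r in index.get((cx, cy), []):
                d = haversine_km(lat, lon, r["lat"], r["lon"])
                if d <= radius_km:
                    hits.append((d, r))
    hits.sort(key=lambda pair: pair[0])
    return hits
```

Restaurants in cells outside the bounding box are never touched, which is the property that keeps query cost tied to local density rather than to the total catalogue size.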


8️⃣ Preprocessing vs Inflight Computation

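| Aspect | Preprocessing (index time) | Inflight (query time) |
| --- | --- | --- |
| Spatial segmentation | ✅ | ❌ |
| Distance calculation | ❌ | ✅ |
| Text relevance | ✅ (inverted index is built) | ✅ (scores are computed) |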

9️⃣ Full-Text + Geo Search

Example: “pizza near me”

{
  "query": {
    "bool": {
      "must": {
        "match": {
          "cuisine": "pizza"
        }
      },
      "filter": {
        "geo_distance": {
          "distance": "3km",
          "location": {
            "lat": 12.97,
            "lon": 77.59
          }
        }
      }
    }
  }
}
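In application code, the two query shapes shown so far (geo only, and geo plus text) can come from one builder. The sketch below is one possible way to do that; the function name, the `size` default, and matching on the `cuisine` field are assumptions carried over from the example above.

```python
def build_search_body(lat, lon, radius_km, text=None, size=20):
    """Return a search body: a geo_distance filter plus an optional text match."""
    bool_query = {
        "filter": [
            {
                "geo_distance": {
                    "distance": f"{radius_km}km",
                    "location": {"lat": lat, "lon": lon},
                }
            }
        ]
    }
    if text:
        # "must" contributes to the relevance score; "filter" does not.
        bool_query["must"] = [{"match": {"cuisine": text}}]
    return {"size": size, "query": {"bool": bool_query}}


body = build_search_body(12.97, 77.59, 3, text="pizza")
```

Keeping the geo clause in `filter` and the text clause in `must` mirrors the JSON above: distance narrows the candidate set, and only the text match influences scoring.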

🔟 Ranking Strategy

Distance alone is insufficient.

Typical conceptual scoring:


Final Score =
  Text Relevance
+ Rating Weight
+ Popularity Score
- Distance Penalty
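As a rough illustration, the formula can be written as a weighted sum. The weights and field names below are placeholders chosen for the example, not values from any production system; in practice they would be tuned against click or order data.

```python
def final_score(text_relevance, rating, popularity, distance_km,
                w_rating=2.0, w_popularity=1.0, w_distance=0.5):
    """Weighted sum of the signals above; all weights are illustrative assumptions."""
    return (
        text_relevance
        + w_rating * rating
        + w_popularity * popularity
        - w_distance * distance_km
    )


def rank(candidates):
    """Sort candidate dicts by final_score, best first."""
    return sorted(
        candidates,
        key=lambda c: final_score(c["text_relevance"], c["rating"], c["popularity"], c["distance_km"]),
        reverse=True,
    )


candidates = [
    {"name": "Near", "text_relevance": 1.0, "rating": 3.5, "popularity": 0.2, "distance_km": 0.5},
    {"name": "Far", "text_relevance": 1.0, "rating": 4.5, "popularity": 0.8, "distance_km": 4.0},
]
ranked = rank(candidates)
```

With these weights the farther but better-rated restaurant ranks first, which is exactly the behaviour a distance-only sort cannot produce.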

Scripted Sorting Example

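{
  "_script": {
    "type": "number",
    "script": {
      "source": "doc['rating'].value * 2 - doc['location'].arcDistance(params.lat, params.lon) / 1000",
      "params": {
        "lat": 12.97,
        "lon": 77.59
      }
    },
    "order": "desc"
  }
}

There is no stored distance field, so the script computes distance from the location field with arcDistance, which returns meters; dividing by 1000 makes the penalty one point per kilometre.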

1️⃣1️⃣ Radius Search Tradeoffs

  • Small radius → faster, fewer results
  • Large radius → slower, noisier results
  • Dynamic radius → best UX

Common approach: start small, expand if results are insufficient.
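The start-small-and-expand approach can be sketched as a loop over a list of radii. The radius steps, the `min_results` threshold, and the `search_fn` callback signature are assumptions for illustration; `search_fn` stands in for whatever function actually calls the search backend.

```python
def search_with_expanding_radius(search_fn, lat, lon,
                                 radii_km=(1, 3, 5, 10), min_results=10):
    """Try each radius in order and stop at the first one that returns enough results.

    Returns (radius_used, results). If no radius yields enough results,
    the results from the largest radius are returned.
    """
    radius_used, results = None, []
    for radius_km in radii_km:
        radius_used = radius_km
        results = search_fn(lat, lon, radius_km)
        if len(results) >= min_results:
            break
    return radius_used, results
```

Each expansion costs another backend round trip, so the list of radii is usually kept short.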


1️⃣2️⃣ Caching with Redis

Caching is optional but useful for hot locations.

Example cache key (coordinates rounded to two decimals, roughly a 1 km grid, so nearby users share an entry):


city:blr:lat:12.97:lon:77.59:radius:3km

TTL should be short (5–15 minutes).
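A minimal cache-aside sketch follows. To stay self-contained it uses an in-memory dict with expiry as a stand-in for Redis; the class and function names are hypothetical, and the 10-minute default TTL is simply a value inside the range suggested above.

```python
import time


def cache_key(city, lat, lon, radius_km, precision=2):
    """Round coordinates so that nearby users map to the same key."""
    return f"city:{city}:lat:{lat:.{precision}f}:lon:{lon:.{precision}f}:radius:{radius_km}km"


class TTLCache:
    """In-memory stand-in for Redis with per-entry expiry."""

    def __init__(self, ttl_seconds=600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


def cached_search(cache, key, search_fn):
    """Cache-aside: return the cached value, or compute, store, and return it."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = search_fn()
    cache.set(key, value)
    return value
```

Coarser rounding raises the hit rate but makes results less precise for users near a cell boundary, which is the caching-versus-freshness tradeoff in miniature.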


1️⃣3️⃣ Why Not Redis GEO for Discovery?

| Redis GEO | Elasticsearch |
| --- | --- |
| Fast | Slightly slower |
| No full-text | Full-text search |
| No ranking | Advanced ranking |
| Memory-heavy | Disk-backed |

Redis GEO is better suited for real-time driver matching.


1️⃣4️⃣ Postgres + PostGIS?

It works, but with limitations.

  • Good for early-stage systems
  • Hard to scale ranking logic
  • Search complexity grows quickly

At scale, Elasticsearch is usually the better fit when text search, geo filtering, and ranking have to run in the same query.


1️⃣5️⃣ Data Freshness

  • Ratings → async index updates
  • Menu changes → delayed propagation
  • Location changes → extremely rare

Near real-time consistency is acceptable for discovery.
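Asynchronous rating updates are commonly batched and sent as partial document updates. The sketch below only builds the newline-delimited body that the Elasticsearch bulk API expects; the index name `restaurants`, the function name, and the idea of collecting updates into a list of `(id, rating)` pairs are assumptions for illustration.

```python
import json


def rating_bulk_body(index_name, rating_updates):
    """Build an NDJSON bulk body of partial updates.

    rating_updates is an iterable of (restaurant_id, new_rating) pairs.
    Each update is two lines: an action line and a partial document.
    """
    lines = []
    for restaurant_id, rating in rating_updates:
        lines.append(json.dumps({"update": {"_index": index_name, "_id": restaurant_id}}))
        lines.append(json.dumps({"doc": {"rating": rating}}))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = rating_bulk_body("restaurants", [("r1", 4.5), ("r2", 3.9)])
```

Sending only the changed field keeps each update small, and batching keeps the indexing load off the read path.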


1️⃣6️⃣ Key Tradeoffs

  • Elasticsearch complexity vs scalability
  • Preprocessing vs storage cost
  • Composite ranking vs explainability
  • Caching vs freshness

Closing Thoughts

Search & Discovery is fundamentally:

  • An indexing problem
  • A ranking problem
  • A read-scalability problem
