<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tuấn Anh</title>
    <description>The latest articles on DEV Community by Tuấn Anh (@vesviet).</description>
    <link>https://dev.to/vesviet</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1552279%2F3c8967c3-8340-4362-a0bf-f4663d852a2a.jpg</url>
      <title>DEV Community: Tuấn Anh</title>
      <link>https://dev.to/vesviet</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vesviet"/>
    <language>en</language>
    <item>
      <title>Escaping the Monolith: How We Replaced Magento with 21 Go Microservices</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Fri, 26 Jun 2026 15:00:00 +0000</pubDate>
      <link>https://dev.to/vesviet/escaping-the-monolith-how-we-replaced-magento-with-21-go-microservices-30je</link>
      <guid>https://dev.to/vesviet/escaping-the-monolith-how-we-replaced-magento-with-21-go-microservices-30je</guid>
      <description>&lt;p&gt;![Cover Image: A monolithic block breaking into Go microservices]&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Answer-first:&lt;/strong&gt; How we escaped Magento's licensing and scaling walls by migrating to a Composable Commerce Platform built on 21 Go microservices, Dapr PubSub, and a 3-phase Strangler Fig strategy without dropping a single order.&lt;/p&gt;

&lt;p&gt;At exactly midnight during a major campaign, a monolithic Magento server can quickly become your single point of failure. Every engineering team that builds seriously on Magento eventually hits the same walls: the licensing wall ($100k-$200k/year for Enterprise), the scaling wall (scaling the entire massive codebase just to handle a traffic spike on the checkout page), and the developer velocity wall. &lt;/p&gt;

&lt;p&gt;When we decided to migrate away from Magento, the standard industry advice was to split the monolith into "4 to 6 microservices". We ignored that advice. For serious e-commerce at scale, that approach inevitably leads to a &lt;strong&gt;distributed monolith&lt;/strong&gt; — where services are deployed separately but remain tightly coupled through HTTP chains or shared database tables. &lt;/p&gt;

&lt;p&gt;Instead, we built a &lt;strong&gt;Composable Commerce Platform&lt;/strong&gt; using &lt;strong&gt;21 Go microservices&lt;/strong&gt;, Google's Kratos v2 framework, and Dapr PubSub. Here is the blueprint of how we structured it, and how we migrated with zero downtime.&lt;/p&gt;

&lt;p&gt;For the complete architecture deep-dive across all domains, see the &lt;a href="https://tanhdev.com/series/composable-commerce-migration/" rel="noopener noreferrer"&gt;Composable Commerce Migration Series&lt;/a&gt;. &lt;/p&gt;




&lt;h2&gt;
  
  
  The 21-Service Blueprint: Domain-Driven Design in Practice
&lt;/h2&gt;

&lt;p&gt;Before writing a single line of Go code, we established boundaries using &lt;strong&gt;Domain-Driven Design (DDD)&lt;/strong&gt;. Every domain owns its own PostgreSQL database. There are absolutely no cross-domain queries. &lt;/p&gt;

&lt;p&gt;Here is the breakdown of our 6 core domains:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Commerce Flow&lt;/strong&gt; (Checkout, Order, Payment): The highly critical money path, orchestrated using Saga patterns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product &amp;amp; Content&lt;/strong&gt; (Catalog, Pricing, Promotion, Search): Read-heavy, heavily cached, requiring sub-50ms latency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logistics&lt;/strong&gt; (Warehouse, Fulfillment, Shipping): Integration with the physical world and 3PLs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post-Purchase&lt;/strong&gt; (Returns, Loyalty): Customer retention and post-sale state machines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identity &amp;amp; Access&lt;/strong&gt; (Auth, User, Customer): Strict separation between internal staff (RBAC) and external customer PII.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform Operations&lt;/strong&gt; (Gateway, Analytics, Notification): Shared infrastructure utilities.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here is how traffic flows through the ecosystem:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpqp6jx2v9pdt9v8a3b1z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpqp6jx2v9pdt9v8a3b1z.png" alt=" " width="784" height="469"&gt;&lt;/a&gt;&lt;br&gt;
![Commerce Ecosystem Architecture]&lt;br&gt;
&lt;em&gt;(For the full traffic anatomy and Elasticsearch CQRS implementation, see the &lt;a href="https://tanhdev.com/posts/blueprint-ecommerce-microservices-architecture-diagram/" rel="noopener noreferrer"&gt;Architecture Blueprint&lt;/a&gt;).&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The 3 Rules of Decoupling: Avoiding the Distributed Monolith
&lt;/h2&gt;

&lt;p&gt;The most common failure mode when migrating from a monolith is building a system where a single failure in a non-critical service cascades and takes down the checkout flow. We avoided this with three hard rules:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. No cross-domain database queries.&lt;/strong&gt;&lt;br&gt;
If the &lt;code&gt;Order&lt;/code&gt; service needs product data, it cannot query the &lt;code&gt;Catalog&lt;/code&gt; database. It must either hold a denormalized copy (via CQRS) or call the &lt;code&gt;Catalog&lt;/code&gt; service's gRPC API. This ensures schema changes in one domain never break another.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. No synchronous calls in the async event path.&lt;/strong&gt;&lt;br&gt;
Once an event (e.g., &lt;code&gt;order.paid&lt;/code&gt;) enters the Dapr PubSub mesh, it is processed independently. Downstream services like &lt;code&gt;Fulfillment&lt;/code&gt; or &lt;code&gt;Loyalty&lt;/code&gt; react to the event, but they &lt;strong&gt;never synchronously call back&lt;/strong&gt; into the producer. If the &lt;code&gt;Loyalty&lt;/code&gt; service is down, the order still succeeds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. No shared deployment pipelines.&lt;/strong&gt;&lt;br&gt;
Every service lives in a Rush monorepo but has its own ArgoCD Application and container registry path. A bug or a blocked deployment in the &lt;code&gt;Notification&lt;/code&gt; service cannot block a hotfix for the &lt;code&gt;Checkout&lt;/code&gt; service.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Zero-Downtime Migration Strategy: The 3-Phase Strangler Fig
&lt;/h2&gt;

&lt;p&gt;Magento's EAV (Entity-Attribute-Value) schema is a trap. You cannot just run an ETL job over the weekend and cut over traffic. We used a strict &lt;strong&gt;3-Phase Strangler Fig&lt;/strong&gt; approach to ensure zero dropped orders:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Phase 1 (Read-Only via CDC):&lt;/strong&gt; We deployed the Go microservices in read-only mode behind our API Gateway. We used &lt;strong&gt;Debezium&lt;/strong&gt; to stream Magento MySQL changes to our Go services via Dapr. Writes still went to Magento.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 2 (Dual-Write &amp;amp; Conflict Resolution):&lt;/strong&gt; Microservices started accepting writes, persisting to their own Postgres DBs, and publishing domain events. A sync-adapter service listened to these events and wrote back to Magento. Conflicts were resolved by explicit policies (e.g., timestamp-wins).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 3 (Full Cutover via GitOps):&lt;/strong&gt; Using ArgoCD, we gradually shifted traffic per service: 25% → 50% → 75% → 100%. Magento was kept on "hot standby" for a 30-day rollback window.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Key Takeaways for Engineers Migrating Monoliths
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Establish boundaries before writing code.&lt;/strong&gt; Do not extract services based on UI pages. Extract them based on business domains (DDD) and data ownership.&lt;br&gt;
&lt;strong&gt;2. Enforce the DB-per-service rule strictly.&lt;/strong&gt; The moment you allow two services to read the same database table, you have coupled their deployment lifecycles.&lt;br&gt;
&lt;strong&gt;3. CDC is your best friend during migration.&lt;/strong&gt; Using Debezium and Kafka/Dapr to stream data out of the legacy monolith is much safer than building massive REST APIs on the legacy system just for synchronization.&lt;br&gt;
&lt;strong&gt;4. Do not do a "big bang" release.&lt;/strong&gt; The Strangler Fig pattern takes longer to build (due to dual-write adapters), but it is the only way to migrate an active e-commerce platform without risking catastrophic revenue loss.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why Go (Golang) instead of Node.js or Java for microservices?
&lt;/h3&gt;

&lt;p&gt;Go provides the perfect balance for e-commerce microservices: it compiles to a single static binary (fast startup for scaling), uses very little memory compared to JVM languages, and has excellent concurrency primitives (goroutines) for handling high-throughput API gateways and event consumers. &lt;/p&gt;

&lt;h3&gt;
  
  
  How do you handle distributed transactions across 21 services?
&lt;/h3&gt;

&lt;p&gt;We don't do distributed locks or 2PC (Two-Phase Commit). We use the &lt;strong&gt;Saga Pattern&lt;/strong&gt; orchestrated by the Checkout service, leveraging Dapr PubSub. If a step fails, the orchestrator fires compensating events to roll back previous steps. &lt;/p&gt;

&lt;h3&gt;
  
  
  Does the API Gateway become a new monolith?
&lt;/h3&gt;

&lt;p&gt;No. The gateway only handles Auth, Rate Limiting, and basic routing. We strictly forbid placing business logic in the API Gateway. It acts purely as a traffic shield and BFF (Backend-For-Frontend).&lt;/p&gt;

&lt;p&gt;For the full engineering blueprint — including the exact EAV extraction SQL and our Dapr Saga implementation — read the complete &lt;a href="https://tanhdev.com/series/composable-commerce-migration/" rel="noopener noreferrer"&gt;Composable Commerce Migration Series&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>go</category>
      <category>microservices</category>
      <category>systemdesign</category>
      <category>architecture</category>
    </item>
    <item>
      <title>[System Design] Chapter 2: Flash Sale Engine - Solving Overselling and Hot Keys</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Tue, 23 Jun 2026 23:38:50 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-chapter-2-flash-sale-engine-solving-overselling-and-hot-keys-3pb9</link>
      <guid>https://dev.to/vesviet/system-design-chapter-2-flash-sale-engine-solving-overselling-and-hot-keys-3pb9</guid>
      <description>&lt;p&gt;&lt;a href="https://dev.to/series/shopee-architecture/"&gt;← Series hub&lt;/a&gt;&lt;br&gt;
&lt;a href="https://dev.to/series/shopee-architecture/01-microservices-foundation/"&gt;← Prev&lt;/a&gt; • &lt;a href="https://dev.to/series/shopee-architecture/03-traffic-shield/"&gt;Next →&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  Chapter 2: Flash Sale Engine - The Mystery Behind Redis and Hot Keys
&lt;/h1&gt;

&lt;p&gt;Flash Sale events are the ultimate stress test for system architecture. When an iPhone is sold for $1, millions of users will smash the "Buy Now" button in the exact same millisecond. If this massive spike hits a MySQL database directly, the system will instantly crash due to Row Locks and Deadlocks. &lt;/p&gt;
&lt;h2&gt;
  
  
  1. The Hot Key Problem and Two-Tier Caching
&lt;/h2&gt;

&lt;p&gt;A highly discounted product is known as a &lt;strong&gt;Hot Key&lt;/strong&gt;.&lt;br&gt;
Many developers mistakenly believe that "just putting inventory in Redis" solves everything. However, a single Redis node has Network Bandwidth and CPU limits (typically maxing out at ~100k Ops/sec). One million clicks on a single key will saturate the network interface card (NIC) of that Redis node.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shopee's Solution: Multi-Level Caching&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tier 1 (Local Cache):&lt;/strong&gt; Built directly into the RAM of the Golang Application Servers (using tools like &lt;code&gt;sync.Map&lt;/code&gt; or &lt;code&gt;BigCache&lt;/code&gt;). This local cache only stores a boolean flag: "Is the item still in stock?". It has a TTL of just 1-2 seconds but successfully blocks 90% of useless traffic from hitting the network once the item is sold out.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tier 2 (Distributed Cache - Redis):&lt;/strong&gt; Only when the Local Cache reports that the item is available does the request proceed to the Redis cluster.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  2. Preventing Overselling with Atomic Lua Scripts
&lt;/h2&gt;

&lt;p&gt;When a user buys an item, the system must deduct the inventory. But if you use standard commands: Read stock (GET) -&amp;gt; Check if &amp;gt; 0 -&amp;gt; Write new stock (SET), you will face a critical &lt;strong&gt;Race Condition&lt;/strong&gt;. Two parallel threads might both read a stock value of 1, both decrement it, and result in selling two items when only one existed (Overselling).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; Shopee wraps the inventory deduction logic inside &lt;strong&gt;Lua Scripts&lt;/strong&gt; running natively within Redis. Because Redis is fundamentally single-threaded, executing a Lua script acts as an &lt;strong&gt;Atomic Transaction&lt;/strong&gt;—no other requests can interrupt it mid-execution.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Example Lua Script for Inventory Deduction&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;stock_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;stock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stock_key&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;stock&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;stock&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'DECR'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stock_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="c1"&gt;-- Purchase Successful&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="c1"&gt;-- Out of Stock&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Fail Fast:&lt;/strong&gt; Thanks to this mechanism, if the Lua script returns 0, the request is immediately rejected and the user sees an "Out of Stock" message. This RAM-level operation takes mere microseconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Inventory Sharding
&lt;/h2&gt;

&lt;p&gt;For mega-campaigns, a single Hot Key on a single Redis Node is still too risky. Shopee employs &lt;strong&gt;Inventory Sharding&lt;/strong&gt;.&lt;br&gt;
If there are 1,000 iPhones, they do not store the number 1,000 in a single key &lt;code&gt;iphone_stock&lt;/code&gt;. Instead, they slice it into 10 shards: &lt;code&gt;iphone_stock_1&lt;/code&gt; to &lt;code&gt;iphone_stock_10&lt;/code&gt;. Each key holds 100 items and is distributed across 10 different physical Redis Nodes.&lt;/p&gt;

&lt;p&gt;A load balancer or router randomly routes incoming user traffic to one of those 10 keys, instantly dividing the massive system pressure by 10.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sequenceDiagram
    participant User
    participant App as Golang Server&amp;lt;br/&amp;gt;(Local Cache)
    participant Redis as Redis Cluster&amp;lt;br/&amp;gt;(Sharded)
    participant Worker as Kafka Worker

    User-&amp;gt;&amp;gt;App: Click "Buy Now"
    Note over App: Check Local Cache.&amp;lt;br/&amp;gt;Block if Out of Stock
    App-&amp;gt;&amp;gt;Redis: Route to shard (e.g. stock_3)
    Note over Redis: Execute Atomic Lua Script
    Redis--&amp;gt;&amp;gt;App: If 0: Return Error
    Redis--&amp;gt;&amp;gt;Worker: If 1: Push Order Event to Queue
    Worker--&amp;gt;&amp;gt;User: Process Order Asynchronously
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Developer Takeaway:&lt;/strong&gt; RAM and caching are your strongest weapons against heavy traffic. However, do not blindly rely on a Distributed Cache. Combine it with Local Caches on the App Server to save network bandwidth, and always use Lua Scripts to guarantee data consistency when handling sensitive numbers like inventory or wallet balances.&lt;/p&gt;




&lt;h2&gt;
  
  
  References &amp;amp; Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://medium.com/@kiki.syah/inventory-system-design-to-handle-flash-sales-37fc2e8dcffb" rel="noopener noreferrer"&gt;Handling Flash Sales with Redis and Lua (Medium)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/@soesah/how-to-handle-flash-sales-using-redis-c02058e0a811" rel="noopener noreferrer"&gt;Solving the Hot Key Problem with Inventory Sharding&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://careers.shopee.sg/blog/" rel="noopener noreferrer"&gt;Shopee Engineering Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;{{&amp;lt; author-cta &amp;gt;}}&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/shopee-architecture/02-flash-sale-engine/" rel="noopener noreferrer"&gt;Chapter 2: Flash Sale Engine - Solving Overselling and Hot Keys&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>highconcurrency</category>
      <category>ecommerce</category>
    </item>
    <item>
      <title>[System Design] Banking Microservices Architecture: Event Sourcing, CQRS &amp; Saga Patterns for Core Banking</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Sun, 21 Jun 2026 00:28:28 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-banking-microservices-architecture-event-sourcing-cqrs-saga-patterns-for-core-5720</link>
      <guid>https://dev.to/vesviet/system-design-banking-microservices-architecture-event-sourcing-cqrs-saga-patterns-for-core-5720</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series context (Part 4 of 8):&lt;/strong&gt; This article assumes familiarity with &lt;a href="https://dev.to/series/core-banking-developer/part-3-database-transactions-acid/"&gt;ACID transactions and database concurrency&lt;/a&gt;. Understanding why consistency guarantees are hard at the database layer is essential context before introducing distributed patterns here.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why Microservices in Banking?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Microservices in banking&lt;/strong&gt; is the architectural pattern where a core banking system is broken into independently deployable, domain-owned services (CIF, Payments, Lending, Notifications) connected by an event bus instead of direct database calls. This replaces monolithic systems like T24 or Flexcube — where a single change to the Payments module requires redeploying the entire application and risks taking down unrelated services.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High-risk deployments:&lt;/strong&gt; Modifying a small module requires redeploying the entire system. A patch to the Payments module can take down CIF.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inefficient scaling:&lt;/strong&gt; You cannot scale just the Payments module during peak loads without scaling everything else — including parts that don't need more capacity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technology lock-in:&lt;/strong&gt; Bound to a single programming language and database. Adding a modern ML risk engine becomes an 18-month integration project.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The current trend is transitioning to Headless Core Banking&lt;/strong&gt; — decoupling the domain logic from the delivery channels (Mobile App, Internet Banking, ATM) using a banking microservices architecture.&lt;/p&gt;




&lt;h2&gt;
  
  
  Overall Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    ┌─────────────────────────────────────┐
  CHANNELS          │  Mobile App  │  Internet Banking  │  ATM/POS  │
                    └──────────────────────┬──────────────────────────┘
                                           │ REST/gRPC
                    ┌──────────────────────▼──────────────────────────┐
  API GATEWAY       │         API Gateway (Auth, Rate Limit, Routing)  │
                    └──────────────────────┬──────────────────────────┘
                                           │
         ┌─────────────────────────────────┼──────────────────────────┐
         │                                 │                          │
  ┌──────▼──────┐               ┌──────────▼─────────┐    ┌──────────▼──────────┐
  │ CIF Service │               │  Account Service   │    │ Payment Service     │
  │ (Customer)  │               │  (CASA, GL)        │    │ (Transfers, Fees)   │
  └─────────────┘               └────────────────────┘    └─────────────────────┘
         │                                 │                          │
         └─────────────────────────────────┼──────────────────────────┘
                                           │ Events (Kafka/Dapr)
                    ┌──────────────────────▼──────────────────────────┐
  EVENT BUS         │              Message Broker (Kafka / Redis)      │
                    └──────────────────────┬──────────────────────────┘
                                           │
         ┌─────────────────────────────────┼──────────────────────────┐
         │                                 │                          │
  ┌──────▼──────┐               ┌──────────▼─────────┐    ┌──────────▼──────────┐
  │ Loan Service│               │ Notification Svc   │    │ Reporting Service   │
  │ (Lending)   │               │ (SMS, Push, Email) │    │ (CQRS Read Side)    │
  └─────────────┘               └────────────────────┘    └─────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Pattern 1: Event Sourcing for the Ledger
&lt;/h2&gt;

&lt;p&gt;In traditional architectures, we store the &lt;strong&gt;current state&lt;/strong&gt;. In Event Sourcing, we store a &lt;strong&gt;sequence of immutable events&lt;/strong&gt; that produce that state.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why does Event Sourcing fit Core Banking?
&lt;/h3&gt;

&lt;p&gt;The ledger is already essentially Event Sourcing — every entry is an immutable event. The current balance is simply the result of &lt;strong&gt;replaying&lt;/strong&gt; all entries from the beginning.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Events in the Account domain&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;AccountOpened&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;AccountID&lt;/span&gt;    &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;CIFNumber&lt;/span&gt;    &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Currency&lt;/span&gt;     &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;OpenedAt&lt;/span&gt;     &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;MoneyDeposited&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;AccountID&lt;/span&gt;     &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Amount&lt;/span&gt;        &lt;span class="kt"&gt;int64&lt;/span&gt;
    &lt;span class="n"&gt;TransactionID&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;OccurredAt&lt;/span&gt;    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;MoneyWithdrawn&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;AccountID&lt;/span&gt;     &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Amount&lt;/span&gt;        &lt;span class="kt"&gt;int64&lt;/span&gt;
    &lt;span class="n"&gt;TransactionID&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;OccurredAt&lt;/span&gt;    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Calculate balance by replaying events&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;calculateBalance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;int64&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="kt"&gt;int64&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;MoneyDeposited&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Amount&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;MoneyWithdrawn&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Amount&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Pattern 2: CQRS — Command Query Responsibility Segregation
&lt;/h2&gt;

&lt;p&gt;Core Banking has a unique characteristic: &lt;strong&gt;writes must be exceptionally robust (ACID)&lt;/strong&gt; but &lt;strong&gt;reads need to be lightning fast&lt;/strong&gt; (dashboards, reports). CQRS completely separates these two flows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WRITE SIDE (Command)                READ SIDE (Query)
────────────────────────            ──────────────────────────
POST /transfers            →        Materialized Views
POST /accounts             →        Elasticsearch Index
PUT /loans/repay           →        Redis Cache

↓ Event Published ↓                ↑ Subscribe &amp;amp; Update ↑
         └──────────────────────────┘
              (Event Bus / Kafka)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Real-world Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Write Side:&lt;/strong&gt; Processes transfers using PostgreSQL with full ACID compliance, guaranteeing money isn't lost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read Side:&lt;/strong&gt; Dashboards display transaction history from Elasticsearch — ultra-fast queries, full-text search, and multi-condition filtering.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Pattern 3: Saga — Distributed Transactions Across Services
&lt;/h2&gt;

&lt;p&gt;When a cross-bank transfer requires coordinating 3 services: &lt;strong&gt;Account Service&lt;/strong&gt; (deduct money), &lt;strong&gt;Payment Service&lt;/strong&gt; (send to clearing house), and &lt;strong&gt;Notification Service&lt;/strong&gt; (send SMS), how do you ensure integrity?&lt;/p&gt;

&lt;h3&gt;
  
  
  Choreography Saga (Event-Driven)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Account Service                Payment Service           Notification Service
      │                               │                          │
      │── TransferInitiated ──────────▶│                          │
      │                               │── PaymentSubmitted ──────▶│
      │                               │                          │── SMS Sent
      │◀── PaymentCompleted ──────────│                          │
      │                               │                          │
   (release hold)                                            (done)

If Payment fails:
      │◀── PaymentFailed ─────────────│
      │                               │
   (cancel hold, refund)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Outbox Pattern — Guaranteeing Events are Never Lost
&lt;/h3&gt;

&lt;p&gt;Problem: What if a service successfully commits to the database but fails to publish the event to Kafka?&lt;/p&gt;

&lt;p&gt;Solution: &lt;strong&gt;Write the event to the database within the same transaction, then have a separate worker read and publish it to Kafka.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Outbox table: written in the same transaction as business data&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;outbox_events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;topic&lt;/span&gt;       &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- 'account.transfer.completed'&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt;     &lt;span class="n"&gt;JSONB&lt;/span&gt;        &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt;      &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'PENDING'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;  &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;  &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;published_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Inside the same Database Transaction:&lt;/span&gt;
&lt;span class="c1"&gt;-- 1. Update account balance&lt;/span&gt;
&lt;span class="c1"&gt;-- 2. Write ledger entries  &lt;/span&gt;
&lt;span class="c1"&gt;-- 3. INSERT into outbox_events&lt;/span&gt;

&lt;span class="c1"&gt;-- Separate worker running periodically:&lt;/span&gt;
&lt;span class="c1"&gt;-- SELECT * FROM outbox_events WHERE status = 'PENDING'&lt;/span&gt;
&lt;span class="c1"&gt;-- → Publish to Kafka&lt;/span&gt;
&lt;span class="c1"&gt;-- → UPDATE status = 'PUBLISHED'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  API Design for Financial Transactions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Design Principles
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Stateless APIs:&lt;/strong&gt; Every request must contain all necessary information.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mandatory Idempotency headers&lt;/strong&gt; for all state-changing APIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strict separation between Request (commands) and Status (polling).&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /v1/transfers                    → Initiate transfer command
  Header: Idempotency-Key: &amp;lt;uuid&amp;gt;
  Body: { from, to, amount, currency }
  Response: { transfer_id, status: "PROCESSING" }

GET  /v1/transfers/{transfer_id}      → Check result
  Response: { status: "COMPLETED" | "FAILED", ... }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Never design a transfer API as a &lt;strong&gt;synchronous block&lt;/strong&gt; because processing through a central clearing network (like SWIFT) can take anywhere from seconds to minutes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Technical Stack Selection
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Popular Choices&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Service Framework&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Go (Kratos, Fiber), Java (Spring Boot)&lt;/td&gt;
&lt;td&gt;High performance, type-safe&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primary Database&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PostgreSQL&lt;/td&gt;
&lt;td&gt;Strong ACID, flexible JSONB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cache&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Redis&lt;/td&gt;
&lt;td&gt;Balances, sessions, rate limiting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Event Bus&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Apache Kafka, Dapr PubSub&lt;/td&gt;
&lt;td&gt;Durable, ordered, replayable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Service Mesh&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Istio, Dapr&lt;/td&gt;
&lt;td&gt;mTLS, circuit breaking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Orchestration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Kubernetes&lt;/td&gt;
&lt;td&gt;Auto-scaling, self-healing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  References &amp;amp; Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://microservices.io/" rel="noopener noreferrer"&gt;Microservices Patterns: Saga and Transactional Outbox (Chris Richardson)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mambu.com/composable-banking" rel="noopener noreferrer"&gt;Mambu: Composable Banking Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://thoughtmachine.net/vault-core" rel="noopener noreferrer"&gt;Thought Machine: Vault Core Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://martinfowler.com/cqrs.html" rel="noopener noreferrer"&gt;Martin Fowler: Event Sourcing &amp;amp; CQRS&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 &lt;strong&gt;Previous Step:&lt;/strong&gt; Explore the foundational database layer in &lt;a href="https://dev.to/series/core-banking-developer/part-3-database-transactions-acid/"&gt;Part 3 — Database Design for Financial Transactions (ACID &amp;amp; Concurrency)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Next Step:&lt;/strong&gt; Now that you understand banking microservices architecture and its event-driven patterns, see how these services communicate with the outside world through international financial standards. Continue reading &lt;a href="https://dev.to/series/core-banking-developer/part-5-iso-standards-integration/"&gt;Part 5 — International Integration Standards: ISO 8583 &amp;amp; ISO 20022&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Deep Dive:&lt;/strong&gt; For a complete engineering guide to the full composable banking stack — ledger concurrency patterns, Strangler Fig migrations, RFC 8705 mTLS, and the next-gen vendor landscape — see &lt;a href="https://dev.to/posts/composable-banking-architecture"&gt;Composable Banking Architecture: From Monolith to Modular Core&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/core-banking-developer/part-4-modern-core-banking-architecture/" rel="noopener noreferrer"&gt;Banking Microservices Architecture: Event Sourcing, CQRS &amp;amp; Saga Patterns for Core Banking&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>architecture</category>
      <category>microservices</category>
      <category>fintech</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>[System Design] GraphHopper Distance Matrix: Self-Host OSRM vs Haversine for Route Optimization</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Fri, 19 Jun 2026 01:02:51 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-graphhopper-distance-matrix-self-host-osrm-vs-haversine-for-route-optimization-2cpg</link>
      <guid>https://dev.to/vesviet/system-design-graphhopper-distance-matrix-self-host-osrm-vs-haversine-for-route-optimization-2cpg</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series context:&lt;/strong&gt; This is Part 7 of the &lt;a href="https://dev.to/series/ecommerce-order-allocation/"&gt;E-commerce Order Allocation&lt;/a&gt; series. The distance matrix built here feeds directly into the OR-Tools VRP solver in &lt;a href="https://dev.to/series/ecommerce-order-allocation/part-6-build-mini-allocation-engine/"&gt;Part 6&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Invisible yet Most Expensive Bottleneck in E-commerce Routing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Answer-first:&lt;/strong&gt; For 1 warehouse + 100 delivery stops, you need 10,201 pairwise distances. Self-hosting &lt;strong&gt;GraphHopper&lt;/strong&gt; or &lt;strong&gt;OSRM&lt;/strong&gt; on a $20/month VPS delivers this in under 50ms per batch — completely free. Using Google Maps Distance Matrix API for the same workload costs &lt;strong&gt;$510/day&lt;/strong&gt;. This guide shows you exactly when to use which tool, with Docker setup and production Python code.&lt;/p&gt;

&lt;p&gt;For any VRP (Vehicle Routing Problem) solver to find the optimal delivery route, it needs to know the exact cost between every pair of stops — this is the &lt;strong&gt;distance matrix&lt;/strong&gt;. For 1 warehouse + 100 orders (101 points), that is &lt;code&gt;101 × 101 = 10,201&lt;/code&gt; pairs. Choosing the wrong tool for this step can cost &lt;strong&gt;$510/day&lt;/strong&gt; in API fees or cause multi-second latency spikes under load.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. As the Crow Flies: The Haversine Formula
&lt;/h2&gt;

&lt;p&gt;This calculates the straight-line distance between two points on a sphere (the Earth) using their Latitude and Longitude.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extremely fast:&lt;/strong&gt; Calculating 10,000 pairs takes milliseconds on a CPU.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No external data needed:&lt;/strong&gt; Pure mathematics; zero reliance on APIs or map data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inaccurate:&lt;/strong&gt; It ignores roads, rivers, tunnels, and one-way streets. In urban reality, actual driving distance is usually 1.2x to 1.5x the Haversine distance.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Python Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;haversine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lon1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lat2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lon2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;R&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;6371.0&lt;/span&gt; &lt;span class="c1"&gt;# Earth radius in km
&lt;/span&gt;
    &lt;span class="n"&gt;dlat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat2&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dlon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lon2&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;lon1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dlat&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; 
         &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; 
         &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dlon&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;atan2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="c1"&gt;# Returns km
&lt;/span&gt;
&lt;span class="c1"&gt;# Building the Distance Matrix:
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_haversine_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;matrix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;haversine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;lng&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;lng&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;matrix&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Practical usage:&lt;/strong&gt; Amazon and Grab often use Haversine as a "Candidate Filter" to eliminate points that are obviously too far away before calling more expensive algorithms.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. The Ideal Solution for the Base Problem: Routing Engine (OSRM / GraphHopper)
&lt;/h2&gt;

&lt;p&gt;If your problem &lt;strong&gt;only requires delivering from a fixed warehouse to customers&lt;/strong&gt;, &lt;strong&gt;without caring about real-time traffic&lt;/strong&gt; or &lt;strong&gt;rush hour histories&lt;/strong&gt;, and solely needs distance based on the &lt;strong&gt;actual existing road network&lt;/strong&gt; — then this is the perfect, most cost-effective solution.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Prerequisite:&lt;/strong&gt; This section assumes familiarity with &lt;a href="https://dev.to/series/ecommerce-order-allocation/part-3-allocation-algorithms/"&gt;Order Allocation Algorithms&lt;/a&gt; and how a VRP solver consumes a pre-built distance matrix.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You need a &lt;strong&gt;Routing Engine&lt;/strong&gt; to load road network graph data. This data is usually downloaded for free from &lt;strong&gt;OpenStreetMap (OSM)&lt;/strong&gt; — which the community constantly updates whenever a new road is built, an alley is removed, or weight restrictions change. When the roads change, you simply re-download the map file (&lt;code&gt;.osm.pbf&lt;/code&gt;) and restart the engine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not standard Dijkstra or A*?
&lt;/h3&gt;

&lt;p&gt;If you run a standard &lt;strong&gt;Dijkstra&lt;/strong&gt; or &lt;strong&gt;A&lt;/strong&gt;* algorithm to find paths between 10,000 pairs on a city-wide map (which has millions of nodes), your server will freeze because the graph is too large to traverse per query.&lt;/p&gt;

&lt;p&gt;Therefore, modern Routing Engines use "pre-processing" techniques:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Contraction Hierarchies (CH):&lt;/strong&gt; Spends a few hours pre-analyzing the map to create "shortcuts" (virtual highways) between areas. At query time, the algorithm jumps across these shortcuts rather than traversing every alley → Query time drops to &lt;strong&gt;microseconds&lt;/strong&gt;. The downside is that if a road closes, you must rebuild the entire map.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Level Dijkstra (MLD) / Customizable Route Planning (CRP):&lt;/strong&gt; Divides the map into small cells. Updating the cost of a road only requires updating that specific cell (taking seconds) rather than the whole map. Great for real-time traffic injection.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The Open-Source Giants: OSRM and GraphHopper Distance Matrix
&lt;/h3&gt;

&lt;p&gt;In custom logistics systems, the two most famous open-source engines are &lt;strong&gt;OSRM&lt;/strong&gt; (C++) and &lt;strong&gt;GraphHopper&lt;/strong&gt; (Java).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;OSRM (Open Source Routing Machine)&lt;/th&gt;
&lt;th&gt;GraphHopper&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Language&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;C++&lt;/td&gt;
&lt;td&gt;Java&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Main Algorithms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MLD and CH&lt;/td&gt;
&lt;td&gt;CH, A*, Landmark&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Routing Profiles&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Written in Lua (driving, cycling, walking)&lt;/td&gt;
&lt;td&gt;Written in Java / YAML&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Matrix API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unbelievably fast (C++ optimized)&lt;/td&gt;
&lt;td&gt;Good, but OSRM edges out on massive matrices&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Quite rigid, hard to change rules at runtime&lt;/td&gt;
&lt;td&gt;Highly flexible (Custom Models) allows runtime rule changes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Community&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Backed by Mapbox, widely used for static routing&lt;/td&gt;
&lt;td&gt;Easy to embed in Java apps, very flexible&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Routing Profiles:&lt;/strong&gt; &lt;br&gt;
The route for a motorcycle navigating tight alleys is completely different from a 10-ton truck (which is banned from small roads). Both OSRM and GraphHopper allow you to define &lt;strong&gt;Profiles&lt;/strong&gt;. In OSRM, you write Lua files: &lt;code&gt;car.lua&lt;/code&gt;, &lt;code&gt;truck.lua&lt;/code&gt;, &lt;code&gt;bike.lua&lt;/code&gt; to tell the engine which roads are legal.&lt;/p&gt;
&lt;h3&gt;
  
  
  OSRM Table API: Lightning Fast Matrices
&lt;/h3&gt;

&lt;p&gt;Assuming you self-host an OSRM server (very easy via Docker), it provides a &lt;code&gt;table&lt;/code&gt; API that generates distance matrices optimally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Request to local OSRM to calculate a 3x3 matrix&lt;/span&gt;
curl &lt;span class="s2"&gt;"http://localhost:5000/table/v1/driving/106.70,10.77;106.71,10.78;106.72,10.79?annotations=distance,duration"&lt;/span&gt;

&lt;span class="c"&gt;# Returns both durations (seconds) and distances (meters)&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"durations"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;[&lt;/span&gt;0, 150, 320],
    &lt;span class="o"&gt;[&lt;/span&gt;155, 0, 180],
    &lt;span class="o"&gt;[&lt;/span&gt;330, 175, 0]
  &lt;span class="o"&gt;]&lt;/span&gt;,
  &lt;span class="s2"&gt;"distances"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;[&lt;/span&gt;0, 1200, 2400],
    &lt;span class="o"&gt;[&lt;/span&gt;1250, 0, 1100],
    &lt;span class="o"&gt;[&lt;/span&gt;2500, 1150, 0]
  &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: OSRM Table API can return a 100x100 matrix (10,000 elements) in just a few dozen milliseconds!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros of self-hosting:&lt;/strong&gt; Free, insanely fast, zero rate limits.&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; RAM heavy (especially if loading an entire country's map).&lt;/p&gt;


&lt;h2&gt;
  
  
  How to Calculate a Distance Matrix with GraphHopper
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GraphHopper&lt;/strong&gt; is the most developer-friendly open-source routing engine for building a distance matrix in Java or via a self-hosted HTTP API. Unlike OSRM, GraphHopper supports runtime routing rule changes via &lt;strong&gt;Custom Models&lt;/strong&gt; — making it the preferred choice when your delivery fleet has vehicle-specific constraints (weight limits, road class restrictions).&lt;/p&gt;
&lt;h3&gt;
  
  
  Option A: GraphHopper Matrix API (Self-hosted)
&lt;/h3&gt;

&lt;p&gt;Self-host the GraphHopper server with Docker and call the &lt;code&gt;/matrix&lt;/code&gt; endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start GraphHopper server with a local OSM map file&lt;/span&gt;
docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 8989:8989 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/data:/data &lt;span class="se"&gt;\&lt;/span&gt;
  israelhikingmap/graphhopper &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; https://download.geofabrik.de/asia/vietnam-latest.osm.pbf &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_graphhopper_distance_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Calculate a full GraphHopper distance matrix for a list of lat/lng points.
    Returns duration (seconds) and distance (meters) for every pair.
    Args:
        locations: List of dicts with &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lat&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; and &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lng&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; keys.
    Returns:
        dict with &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;durations&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; and &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;distances&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; 2D arrays.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8989/matrix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# GraphHopper Matrix API requires [lng, lat] order (GeoJSON convention)
&lt;/span&gt;    &lt;span class="n"&gt;points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lng&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;loc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;points&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;points&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;profile&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;car&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                     &lt;span class="c1"&gt;# Routing profile: car, bike, foot
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;out_arrays&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;times&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;distances&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# Request both duration and distance
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fail_fast&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;                     &lt;span class="c1"&gt;# Return partial results if a pair is unreachable
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;durations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;times&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;       &lt;span class="c1"&gt;# N×N matrix in seconds
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;distances&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;distances&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;    &lt;span class="c1"&gt;# N×N matrix in meters
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Example: 3 warehouses / delivery points in Ho Chi Minh City
&lt;/span&gt;&lt;span class="n"&gt;locations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;10.7712&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lng&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;106.7011&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# Warehouse
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;10.7780&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lng&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;106.7100&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# Customer A
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;10.7650&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lng&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;106.6980&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# Customer B
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;matrix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_graphhopper_distance_matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Duration matrix (seconds): &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;durations&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Distance matrix (meters):  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;distances&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output:
# Duration matrix (seconds): [[0, 320, 185], [315, 0, 410], [180, 405, 0]]
# Distance matrix (meters):  [[0, 2100, 1350], [2050, 0, 2900], [1300, 2880, 0]]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option B: GraphHopper Java SDK (Embedded)
&lt;/h3&gt;

&lt;p&gt;For Java-based logistics backends, embed GraphHopper directly without a network hop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Embedded GraphHopper distance matrix calculation&lt;/span&gt;
&lt;span class="c1"&gt;// Purpose: Build an NxN distance matrix without requiring a running HTTP server&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.graphhopper.GraphHopper&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.graphhopper.config.CHProfile&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.graphhopper.config.Profile&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.graphhopper.routing.util.EncodingManager&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nc"&gt;GraphHopper&lt;/span&gt; &lt;span class="n"&gt;hopper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GraphHopper&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;hopper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setOSMFile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/data/vietnam-latest.osm.pbf"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;hopper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setGraphHopperLocation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/data/graph-cache"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;hopper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setProfiles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Profile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"car"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;setVehicle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"car"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;setWeighting&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"fastest"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="n"&gt;hopper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCHPreparationHandler&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;setCHProfiles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;CHProfile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"car"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="n"&gt;hopper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;importOrLoad&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Use GHMatrixAPI to compute full NxN matrix&lt;/span&gt;
&lt;span class="c1"&gt;// See: https://github.com/graphhopper/graphhopper/tree/master/web-api&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  GraphHopper vs. OSRM: Which to Choose for Distance Matrix?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;GraphHopper Distance Matrix&lt;/th&gt;
&lt;th&gt;OSRM Table API&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Speed (large matrices)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fast (CH/MLD)&lt;/td&gt;
&lt;td&gt;Slightly faster (C++ optimized)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Custom vehicle rules&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Runtime Custom Models&lt;/td&gt;
&lt;td&gt;❌ Requires recompile + Lua&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Java ecosystem&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Native SDK&lt;/td&gt;
&lt;td&gt;❌ HTTP only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Docker ease&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Official image&lt;/td&gt;
&lt;td&gt;✅ Official image&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OSM data format&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.osm.pbf&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.osm.pbf&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Flexible profiles, Java backends&lt;/td&gt;
&lt;td&gt;Max throughput, static profiles&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb:&lt;/strong&gt; Use &lt;strong&gt;OSRM&lt;/strong&gt; if you have a fixed vehicle type and need maximum raw speed. Use &lt;strong&gt;GraphHopper&lt;/strong&gt; if you need to change routing rules at runtime (truck weight limits, toll avoidance, time-dependent costs).&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Expensive Overkill: Commercial APIs (Google Maps / Mapbox)
&lt;/h2&gt;

&lt;p&gt;If you are building a Ride-Hailing app that requires minute-perfect Estimated Time of Arrival (ETA) so customers don't cancel, you need real-time traffic data and historical rush hour patterns. In that case, you must use commercial APIs (like Google Maps).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;However, for static delivery routing from a fixed warehouse, this is entirely overkill and extremely wasteful.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Calling Google Maps Distance Matrix API (Very Expensive)
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://maps.googleapis.com/maps/api/distancematrix/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;origins&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10.77,106.70|10.78,106.71&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;destinations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10.77,106.70|10.78,106.71&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;departure_time&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;now&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Current traffic
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Exorbitant Cost
&lt;/h3&gt;

&lt;p&gt;Google Maps charges about &lt;strong&gt;$0.005&lt;/strong&gt; per Element (Pair).&lt;br&gt;
Returning to our 1 warehouse + 100 orders = 10,201 pairs.&lt;br&gt;
Every time you run the allocation algorithm, you pay: &lt;code&gt;10,201 × $0.005 = $51&lt;/code&gt;.&lt;br&gt;
If your warehouse dispatches 10 batches a day, you lose &lt;strong&gt;$510/day&lt;/strong&gt; just to calculate distances!&lt;/p&gt;

&lt;p&gt;This is exactly why domestic delivery companies self-host &lt;strong&gt;OSRM&lt;/strong&gt; or &lt;strong&gt;GraphHopper&lt;/strong&gt; instead of throwing money at Google Maps for warehouse routing. The OpenStreetMap network is more than adequate for this need.&lt;/p&gt;


&lt;h2&gt;
  
  
  4. System Design Strategies to Save Compute
&lt;/h2&gt;

&lt;p&gt;To maintain high accuracy without overloading servers, large systems use these tricks:&lt;/p&gt;
&lt;h3&gt;
  
  
  Technique 1: Clustering
&lt;/h3&gt;

&lt;p&gt;If 5 orders are going to the same apartment building, cluster them into a single coordinate. A 101x101 matrix drops to 97x97.&lt;/p&gt;
&lt;h3&gt;
  
  
  Technique 2: Hybrid Matrix (Haversine + OSRM)
&lt;/h3&gt;

&lt;p&gt;Only calculate precise road distances for points that could realistically follow one another.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If Point A and Point B are &amp;gt; 15km apart (via Haversine), assign an infinite cost. The VRP solver will naturally never route a vehicle from A to B.&lt;/li&gt;
&lt;li&gt;Only call OSRM for nearby pairs (&amp;lt; 5km).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Technique 3: H3 Hexagon Caching (How Uber/Grab does it)
&lt;/h3&gt;

&lt;p&gt;Recalculating the distance between two nearby alleys day after day is a waste of resources. Logistics giants use Uber's &lt;strong&gt;H3 (Hexagonal Hierarchical Spatial Index)&lt;/strong&gt; to cache distance results.&lt;/p&gt;
&lt;h4&gt;
  
  
  Why Hexagons and not Geohash (Squares)?
&lt;/h4&gt;

&lt;p&gt;In a square grid, the distance from the center to the 4 straight edges differs from the distance to the 4 corners. In a hexagon grid, the distance from the center to all neighboring cells is &lt;strong&gt;exactly equal&lt;/strong&gt;. This is critical in routing because distance error margins are uniformly controlled in all directions.&lt;/p&gt;
&lt;h4&gt;
  
  
  How it works:
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;1. Pick a Resolution:&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;H3 Resolution 9&lt;/strong&gt; is typically used (each hexagon is about 174 meters wide). This perfectly encapsulates a small residential block or a street segment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Convert Coordinates (Lat/Lng) to H3 Index:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;h3&lt;/span&gt;

&lt;span class="c1"&gt;# Convert Point A and Point B
&lt;/span&gt;&lt;span class="n"&gt;h3_A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;h3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;geo_to_h3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;10.7712&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;106.7011&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Result: '8965a6a0033ffff'
&lt;/span&gt;&lt;span class="n"&gt;h3_B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;h3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;geo_to_h3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;10.7780&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;106.7100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Result: '8965a6a0027ffff'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Check Cache (Redis) before invoking the Engine:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;redis_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;route_cost:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;h3_A&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;h3_B&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;cache_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cache_result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cache_result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Cache Hit! Zero computation.
&lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Cache Miss! Call OSRM
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;call_osrm_api&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lon1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lat2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lon2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Save to Redis for next time (TTL = 30 days)
&lt;/span&gt;    &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;distance_m&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;duration_s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;duration&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="n"&gt;ex&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2592000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Pre-warming the Cache
&lt;/h4&gt;

&lt;p&gt;Instead of waiting for a customer to order, the system can run a batch job at night:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch all H3 cells that contain residential areas in the city.&lt;/li&gt;
&lt;li&gt;Use OSRM (since it's free and local) to pre-calculate the distance between all H3 pairs (within 10km of each other).&lt;/li&gt;
&lt;li&gt;Pump this data into Redis.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When the OR-Tools system runs the next day, it reads purely from Redis RAM at the speed of light without needing to traverse any graphs. Because road networks rarely change (only when there's long-term construction), the Cache Hit ratio can exceed &lt;strong&gt;95%&lt;/strong&gt;, giving you the speed of Haversine but the accuracy of a Routing Engine!&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Summary of the Series: Order Fulfillment is not just a CRUD application tracking statuses. It is a symphony of event-driven microservices, real-time inventory algorithms, OR-Tools optimization, and ultra-fast Routing Engines processing spatial data.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  References &amp;amp; Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.uber.com/en-VN/blog/h3/" rel="noopener noreferrer"&gt;Uber Engineering: H3 Hexagonal Hierarchical Spatial Index&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mapsplatform.google.com/pricing/" rel="noopener noreferrer"&gt;Google Maps Platform Pricing: Distance Matrix API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://project-osrm.org/" rel="noopener noreferrer"&gt;Open Source Routing Machine (OSRM)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.graphhopper.com/" rel="noopener noreferrer"&gt;GraphHopper Routing Engine Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.graphhopper.com/#tag/Matrix-API" rel="noopener noreferrer"&gt;GraphHopper Matrix API Reference&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://download.geofabrik.de/asia/vietnam.html" rel="noopener noreferrer"&gt;OpenStreetMap Vietnam Data (Geofabrik)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 &lt;strong&gt;Next Step:&lt;/strong&gt; Now that you understand how to build an accurate distance matrix with GraphHopper, see how this feeds into &lt;a href="https://dev.to/series/ecommerce-order-allocation/part-6-build-mini-allocation-engine/"&gt;Part 6 — Building a Mini Allocation Engine&lt;/a&gt; and the complete &lt;a href="https://dev.to/series/ecommerce-order-allocation/executive-summary/"&gt;Series Executive Summary&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/ecommerce-order-allocation/part-7-distance-matrix-routing/" rel="noopener noreferrer"&gt;GraphHopper Distance Matrix: Self-Host OSRM vs Haversine for Route Optimization&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>graphhopper</category>
      <category>routing</category>
    </item>
    <item>
      <title>[System Design] Part 4 — Amazon CONDOR &amp; Anticipatory Shipping</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Thu, 18 Jun 2026 00:02:11 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-part-4-amazon-condor-anticipatory-shipping-35nd</link>
      <guid>https://dev.to/vesviet/system-design-part-4-amazon-condor-anticipatory-shipping-35nd</guid>
      <description>&lt;h2&gt;
  
  
  Amazon Fulfillment: The Three Tiers of Optimization
&lt;/h2&gt;

&lt;p&gt;Amazon processes &lt;strong&gt;billions of orders annually&lt;/strong&gt; through a network of over &lt;strong&gt;175 fulfillment centers&lt;/strong&gt; globally. To maintain their 1-2 day (or same-day) delivery guarantees, they built a 3-tier optimization architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│  TIER 1: ANTICIPATORY SHIPPING (Long-term — weeks/months)  │
│  → ML predicts demand → Moves inventory close to customers │
│     BEFORE they place an order                             │
├─────────────────────────────────────────────────────────────┤
│  TIER 2: REGIONALIZATION (Medium-term — days/weeks)        │
│  → Partitions the fulfillment network into autonomous zones│
│  → Ensures 70-80% of orders are fulfilled intra-region     │
├─────────────────────────────────────────────────────────────┤
│  TIER 3: CONDOR (Short-term — hours)                       │
│  → Continuously re-optimizes the fulfillment plan within   │
│     a 5-6 hour window before pick-and-pack begins.         │
└─────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Anticipatory Shipping — Shipping Before You Buy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  A Crazy but Effective Idea
&lt;/h3&gt;

&lt;p&gt;Amazon holds a patent (US Patent 8,615,473) describing a system that &lt;strong&gt;begins shipping items BEFORE a customer places an order&lt;/strong&gt;. It sounds like science fiction, but it's a reality.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Traditional Model:
  Customer orders → Warehouse processes → Ships → Delivered (2-5 days)

Anticipatory Shipping:
  ML predicts: "Customers in Region X will buy 200 iPhone 16s in the next 3 days"
  → Amazon ships 200 iPhones from a central hub to local delivery hubs in Region X
  → Customer places order → The item is already locally staged → Delivered same-day!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ML Model Input Features
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Input Feature&lt;/th&gt;
&lt;th&gt;Significance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Purchase history&lt;/td&gt;
&lt;td&gt;What do they buy, and how often?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browsing behavior&lt;/td&gt;
&lt;td&gt;What are they looking at? Cart abandonment?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wishlists&lt;/td&gt;
&lt;td&gt;Explicitly desired items&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Seasonal patterns&lt;/td&gt;
&lt;td&gt;Winter coats in November, sunscreen in June&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regional demographics&lt;/td&gt;
&lt;td&gt;High-income areas? Young families? College towns?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trending products&lt;/td&gt;
&lt;td&gt;Items going viral on social media&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Weather forecast&lt;/td&gt;
&lt;td&gt;Rain forecasted → move umbrellas to local hubs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Events calendar&lt;/td&gt;
&lt;td&gt;Black Friday, Prime Day, major sports events&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Late-Select Addressing — The Key Technique
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The Anticipatory Flow:

1. ML Model: "Zip code 10001 (NY) has an 87% probability of ordering 200 cases of water in the next 3 days."

2. System: Ships 200 cases from the Midwest Central Hub → NY Local Hub.
   These packages DO NOT HAVE A SPECIFIC CUSTOMER ADDRESS YET → They are just labeled "Destination: NY Hub".

3. Customer A in NY orders 2 cases of water:
   → The system assigns an address to 2 of the pre-staged cases at the NY Hub.
   → Delivered in 2 hours!

4. If predictions are slightly off (e.g., 50 cases remain unsold):
   → Amazon might run a targeted flash sale for that zip code.
   → Or re-route them back to the central hub.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  CONDOR — Customer Order and Network Density OptimizeR
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem CONDOR Solves
&lt;/h3&gt;

&lt;p&gt;When you place an order, Amazon doesn't process it immediately. There is a &lt;strong&gt;5-6 hour window&lt;/strong&gt; between the order being placed and the warehouse actually starting the pick-and-pack process. CONDOR exploits this window to optimize delivery routes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;17:00 — Order A is placed in Zone 1
  → CONDOR Plan v1: Ship from WH Alpha, individual truck.

17:15 — Order B is placed in Zone 1 (near Order A)
  → CONDOR Plan v2: Consolidate A+B onto the same route → saves 1 truck trip.

17:30 — Order C is placed in Zone 2 (along the same route)
  → CONDOR Plan v3: Consolidate A+B+C → highly dense, efficient route.

17:45 — Order D is placed in Zone 9 (opposite direction)
  → CONDOR Plan v4: Route 1 (A+B+C) + Route 2 (D only).

→ Every 15 minutes, CONDOR re-evaluates the entire network to find better plans.
→ Deadline: When the window closes (e.g., 23:00), the warehouse executes the final optimized plan.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The CONDOR Algorithm
&lt;/h3&gt;

&lt;p&gt;CONDOR solves a variation of the &lt;strong&gt;Prize Collecting Vehicle Routing Problem (PCVRP)&lt;/strong&gt;, which is vastly more complex than standard VRP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PCVRP:
  Maximize: Total "prize" (value of orders delivered on time)
  Minimize: Total transportation cost
  Subject to:
    - Capacity constraints (vehicle limits)
    - Time windows (delivery SLAs)
    - Fleet size limits
    - Network density bonuses: bundling orders in the same neighborhood significantly reduces cost-per-package.

Solving Techniques:
  1. Mathematical optimization (LP/MIP relaxation)
  2. Local search heuristics (2-opt, 3-opt swaps between routes)
  3. Iterative re-optimization (running the solver continuously as new data arrives)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Major Breakthrough
&lt;/h3&gt;

&lt;p&gt;Amazon has stated that CONDOR reduces the number of feasible routing decisions for a given area from &lt;strong&gt;millions&lt;/strong&gt; to &lt;strong&gt;under 10 viable options&lt;/strong&gt;, transforming an NP-hard problem into something solvable in near real-time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Regionalization — Partitioning the Network
&lt;/h2&gt;

&lt;p&gt;Prior to 2022, if a customer in New York ordered an item, it might have shipped from a warehouse in California (3,000 miles away) if the local warehouse was out of stock. This was incredibly inefficient.&lt;/p&gt;

&lt;p&gt;Amazon restructured its US network into &lt;strong&gt;8 autonomous regions&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pre-Regionalization:
  Customer in NY → Order fulfilled from CA (3,000 miles) → 3-5 day delivery.

Post-Regionalization:
  Customer in NY → Order fulfilled from NJ or PA (100 miles) → Same/Next day delivery.

Results:
  - Average travel distance per package dropped by ~60%
  - Significant reduction in shipping costs
  - Delivery times dropped by 1-2 days
  - Massive reduction in carbon footprint
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Regional Inventory Strategy
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SKU "IPHONE-16-256GB":
  National Demand: 100,000/month

  Northeast Region (NY, NJ, PA):  25,000/mo → Stock 30,000
  West Region (CA, WA, OR):       20,000/mo → Stock 25,000
  South Region (TX, FL, GA):      18,000/mo → Stock 22,000
  ...

  Buffer: 22,000 kept at a central cross-dock facility for overflow/rebalancing.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Comparing Amazon vs. eBay vs. Regional Marketplaces
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Amazon&lt;/th&gt;
&lt;th&gt;eBay&lt;/th&gt;
&lt;th&gt;Regional Marketplaces&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1P + FBA (Owns warehouses)&lt;/td&gt;
&lt;td&gt;Marketplace (Sellers ship)&lt;/td&gt;
&lt;td&gt;Marketplace + Fulfillment (e.g., Shopee)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Facilities&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;175+ Global FCs&lt;/td&gt;
&lt;td&gt;Seller warehouses + 3PLs&lt;/td&gt;
&lt;td&gt;Regional fulfillment hubs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Allocation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CONDOR (Global/Continuous optimization)&lt;/td&gt;
&lt;td&gt;Rule-based (Seller-defined)&lt;/td&gt;
&lt;td&gt;Regional matching engines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Anticipatory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes (Late-Select Addressing)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Structure&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8 Autonomous Regions (US)&lt;/td&gt;
&lt;td&gt;Decentralized&lt;/td&gt;
&lt;td&gt;Geographic partitioning&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Lessons Learned for System Design
&lt;/h2&gt;

&lt;p&gt;You don't need to build CONDOR to apply its principles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Batch processing over real-time:&lt;/strong&gt; Don't dispatch an order the second it arrives. Hold it in a 15-30 minute batch window and optimize the entire batch together. It is always mathematically superior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Re-optimization:&lt;/strong&gt; The best route at 17:00 may be terrible by 17:30 as new orders arrive. Run iterative optimizations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictive placement:&lt;/strong&gt; If data shows consistent regional demand, stage the inventory there beforehand.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partitioning:&lt;/strong&gt; Break massive NP-hard routing problems into smaller regional chunks to make them solvable.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Next, we explore split shipments, consolidation, and the last-mile delivery problem—which accounts for 53% of all logistics costs. Read &lt;a href="https://dev.to/series/ecommerce-order-allocation/part-5-split-consolidation-lastmile/"&gt;Part 5 — Split Shipment, Consolidation &amp;amp; Last-Mile Delivery&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  References &amp;amp; Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://patents.google.com/patent/US8615473B1/en" rel="noopener noreferrer"&gt;US Patent 8,615,473: Method and system for anticipatory package shipping&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.supplychaindive.com/news/amazon-delivery-speed-routing-condor-automation/648753/" rel="noopener noreferrer"&gt;Amazon CONDOR routing optimization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.aboutamazon.com/news/company-news/amazon-ceo-andy-jassy-2022-letter-to-shareholders" rel="noopener noreferrer"&gt;Amazon CEO Andy Jassy's 2022 Letter to Shareholders: 8-region network&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/ecommerce-order-allocation/part-4-amazon-condor-anticipatory/" rel="noopener noreferrer"&gt;Part 4 — Amazon CONDOR &amp;amp; Anticipatory Shipping&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>ecommerce</category>
      <category>algorithms</category>
    </item>
    <item>
      <title>[System Design] Uber RAMEN: How Ride-Hailing Apps Push Real-Time Notifications to Millions of Devices</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Tue, 16 Jun 2026 23:22:25 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-uber-ramen-how-ride-hailing-apps-push-real-time-notifications-to-millions-of-2156</link>
      <guid>https://dev.to/vesviet/system-design-uber-ramen-how-ride-hailing-apps-push-real-time-notifications-to-millions-of-2156</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Pushing Instant Notifications to Millions of Devices
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Answer-first:&lt;/strong&gt; Uber's RAMEN system maintains persistent gRPC bidirectional streams to every active driver app. When a match is made, the ride offer travels from the matching engine to the driver's phone in under 100ms — without polling. This is how millions of connections are held open simultaneously without crashing the backend.&lt;/p&gt;

&lt;p&gt;When DISCO decides to match you with Driver John Doe, the system must:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Send the ride offer to &lt;strong&gt;exactly&lt;/strong&gt; John Doe's phone (out of millions of connected phones).&lt;/li&gt;
&lt;li&gt;Deliver it in &lt;strong&gt;milliseconds&lt;/strong&gt; (not seconds).&lt;/li&gt;
&lt;li&gt;Ensure the driver &lt;strong&gt;receives it&lt;/strong&gt; even if their 4G connection is weak.&lt;/li&gt;
&lt;li&gt;Simultaneously push the driver's location back to your app so you can watch the car move on the map.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There are two main approaches: &lt;strong&gt;Polling&lt;/strong&gt; (asking continuously) and &lt;strong&gt;Push&lt;/strong&gt; (proactively sending).&lt;/p&gt;




&lt;h2&gt;
  
  
  Polling vs. Push
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Polling (The Old Way — Inefficient)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;The Driver App asks the server every 3 seconds: "Do you have any rides for me?"

GET /api/v1/offers?driver_id=abc123
→ Response: { "offers": [] }  ← Nothing here

GET /api/v1/offers?driver_id=abc123  (3 seconds later)
→ Response: { "offers": [] }  ← Still nothing

GET /api/v1/offers?driver_id=abc123  (3 seconds later)
→ Response: { "offers": [...] }  ← Got a ride! But it's already 0-3 seconds delayed

The Issues:
  - 5 million drivers × 1 request/3s = 1.67 million requests/second (just asking)
  - 99% of requests return empty → Massive waste of server resources
  - Latency of 0-3 seconds → Another driver might accept the ride first
  - Battery drain: the radio chip must be constantly active
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Push (RAMEN — Efficient)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The server maintains an OPEN connection with every driver app.
When a ride offer is ready, the server PROACTIVELY pushes it down instantly.

Driver App ◄═══ gRPC Stream (live connection) ═══► RAMEN Server

Advantages:
  - Latency: &amp;lt; 100ms (near-instantaneous)
  - No wasted empty requests
  - The radio chip is only active when there's actual data to receive
  - Significant battery savings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  RAMEN — Real-time Asynchronous Messaging Network
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;RAMEN&lt;/strong&gt; is the push messaging infrastructure built in-house by Uber. It maintains millions of concurrent persistent connections to push real-time data to rider and driver apps.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three-Tier Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────────────────────────┐
│                     RAMEN Architecture                      │
│                                                              │
│  ┌──────────────────┐                                       │
│  │  Fireball Service │  "When to push?"                     │
│  │  (Decision Engine)│  • Consumes Kafka events              │
│  │                    │  • Evaluates business rules           │
│  │                    │  • Handles priority, localization     │
│  └────────┬─────────┘                                       │
│           │                                                  │
│           ▼                                                  │
│  ┌──────────────────┐                                       │
│  │  API Gateway      │  "What to push?"                     │
│  │  (Payload Builder)│  • Aggregates data from services      │
│  │                    │  • Builds message payloads            │
│  │                    │  • Serializes (Protobuf)              │
│  └────────┬─────────┘                                       │
│           │                                                  │
│           ▼                                                  │
│  ┌──────────────────┐                                       │
│  │  RAMEN Server     │  "How to push?"                      │
│  │  (Delivery Layer) │  • Manages millions of connections    │
│  │                    │  • Routes to the correct device       │
│  │                    │  • Guarantees at-least-once delivery  │
│  └──────────────────┘                                       │
│           │                                                  │
│           ▼                                                  │
│     Mobile Devices (Millions)                                │
└────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Evolution of the Transport Protocol
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Generation 1: Server-Sent Events (SSE) over HTTP/1.1
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;Client ──── HTTP/1.1 Connection ────► RAMEN Server
       ◄═══ SSE (Server → Client only) ═══

Characteristics:
  ✓ Simple, browser-friendly
  ✗ Unidirectional: only Server → Client
  ✗ ACK (acknowledgments) must be sent via separate HTTP POST requests (wasting connections)
  ✗ Head-of-line blocking: large messages block smaller heartbeats
  ✗ Text-based JSON: heavy payloads
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Generation 2: gRPC over QUIC/HTTP3 (Current)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client ◄══ gRPC Bidirectional Stream ══► RAMEN Server
              (HTTP/3 + QUIC)

Characteristics:
  ✓ Full-duplex: both sides send data simultaneously on the same connection
  ✓ Binary framing (Protobuf): small payloads, low CPU usage
  ✓ Multiplexing: multiple streams on 1 connection, no head-of-line blocking
  ✓ QUIC: UDP-based, much more resilient on weak mobile networks
  ✓ ACKs sent right on the current stream (no separate connection needed)
  ✓ Connection migration: switching networks (WiFi → 4G) doesn't break the connection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Scalability: Managing Millions of Connections
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Apache Helix + ZooKeeper
&lt;/h3&gt;

&lt;p&gt;RAMEN cannot run on a single server — it requires a cluster of hundreds of servers, each holding tens of thousands of live connections.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RAMEN Cluster Management:

ZooKeeper: Stores topology metadata (which servers are alive?)
Apache Helix: Manages sharding and automatic rebalancing

User UUID "abc123" → hash() → Shard 42 → RAMEN Server #7
User UUID "def456" → hash() → Shard 18 → RAMEN Server #3

If server #7 dies:
  1. Helix detects it (heartbeat timeout)
  2. Helix moves Shard 42 to server #9
  3. Client "abc123" reconnects → Load Balancer → server #9
  4. Continues receiving messages seamlessly
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Stateful Servers — The Unique Challenge
&lt;/h3&gt;

&lt;p&gt;Unlike standard REST API servers (which are stateless), RAMEN servers are &lt;strong&gt;stateful&lt;/strong&gt; — each server holds specific TCP/gRPC sockets for specific users. Routing must be exact: a message for user "abc123" must reach the &lt;strong&gt;exact server&lt;/strong&gt; holding that user's socket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Routing Flow:

Fireball: "Push ride offer to driver abc123"
  │
  ▼
Routing Layer: hash("abc123") → Server #7
  │
  ▼
Server #7: Finds the socket for abc123 in memory → Pushes the message
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Ensuring Reliability
&lt;/h2&gt;

&lt;h3&gt;
  
  
  At-Least-Once Delivery
&lt;/h3&gt;

&lt;p&gt;Mobile networks are notoriously unreliable — 4G signals can drop for a few seconds and return. RAMEN guarantees that a message is delivered &lt;strong&gt;at least once&lt;/strong&gt; by using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Persistence Layer:

  Cassandra (Durable Storage)     Redis (In-Memory Cache)
  ┌────────────────────┐         ┌────────────────────┐
  │ Source of truth     │         │ Absorb traffic     │
  │ Stores messages    │◄───────│ bursts             │
  │ permanently for     │         │ Thundering herd    │
  │ retries             │         │ protection         │
  └────────────────────┘         └────────────────────┘

Flow:
  1. Message arrives → Written to Cassandra (backup)
  2. Cached in Redis (fast access)
  3. Pushed via gRPC stream to the device
  4. Device sends an ACK
  5. If no ACK is received within 10s → Retry from Cassandra
  6. Receives ACK → Marks as delivered
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sequence Numbers — Handling Reconnects
&lt;/h3&gt;

&lt;p&gt;When a client loses its connection and reconnects, how does the server know which messages it has already received?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Every message has an incrementing sequence number:

Server → Client:
  [seq=1] Ride offer
  [seq=2] ETA update
  [seq=3] Driver location  ← Connection lost here

Client reconnects:
  "Last received: seq=2"

Server:
  → Resends everything from seq=3 onwards (doesn't resend seq=1, seq=2)
  [seq=3] Driver location (retry)
  [seq=4] Driver location (new)
  ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Connection Draining — Preventing Thundering Herds
&lt;/h2&gt;

&lt;p&gt;When deploying new code or scaling down a cluster, RAMEN cannot simply drop millions of connections at once — this would cause a &lt;strong&gt;thundering herd&lt;/strong&gt; (millions of clients reconnecting simultaneously, crashing the entire system).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;Graceful&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Shutdown&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Flow:&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;RAMEN&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;needs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;shut&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;down&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;stops&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;accepting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;NEW&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;connections&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Sends&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Graceful Disconnect"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;all&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;currently&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;connected&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;clients&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;Message:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;type:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DISCONNECT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;backoff_hint_ms:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;random(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Each&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;client&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;receives&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;different&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;backoff_hint:&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Client&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;waits&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;2.3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seconds,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;reconnects&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Client&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;B:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;waits&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;15.7&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seconds,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;reconnects&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Client&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;C:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;waits&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;8.1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seconds,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;reconnects&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Reconnections&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;are&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;spread&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;evenly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;over&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seconds&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;no&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;thundering&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;herd&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Fallback: Silent Push Notifications
&lt;/h2&gt;

&lt;p&gt;When a driver app is pushed to the background by the operating system (due to Android/iOS power management), the gRPC stream will be closed. In this scenario, RAMEN uses &lt;strong&gt;Silent Push Notifications&lt;/strong&gt; via APNs (Apple) / FCM (Google) to "wake up" the app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Primary Flow (App in foreground):
  RAMEN Server ══► gRPC Stream ══► Driver App  ✓ (&amp;lt; 100ms)

Fallback Flow (App in background):
  RAMEN Server ──► FCM/APNs ──► OS wakes app ──► App reconnects gRPC
                                                    (Takes 1-5 seconds)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Grab's Approach: WebSocket + Istio
&lt;/h2&gt;

&lt;p&gt;Grab didn't build a massive custom system like RAMEN; they use a simpler approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;WebSockets&lt;/strong&gt; for real-time bidirectional communication.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Istio Service Mesh&lt;/strong&gt; manages routing, load balancing, and mTLS.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FCM/APNs&lt;/strong&gt; act as fallbacks when the app is backgrounded.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pros: Simpler to maintain, uses open standards. Cons: WebSockets over HTTP/1.1 suffer from head-of-line blocking and aren't as robust as gRPC/QUIC on weak networks.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary of the End-to-End Real-Time Flow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; THE COMPLETE REAL-TIME PIPELINE:

 ① Driver moves
    → GPS Sensor → Kalman Filter → Batches 3 points
    → gRPC Stream → Load Balancer → Location Service

 ② Location Service
    → Converts GPS → H3 Index
    → Produces to Kafka "driver.location.updates"

 ③ Kafka → Consumers:
    ├── Redis GEO (updates driver's location)
    ├── Flink (calculates supply-demand → Surge Pricing)
    └── Analytics Pipeline (Data Lake)

 ④ Rider requests a car
    → Demand Service → Kafka "ride.requests"

 ⑤ DISCO Matching Engine
    → Queries Redis (finds nearby drivers)
    → Routing Service (calculates actual ETA)
    → Hungarian Algorithm (batched matching)
    → Selects the optimal driver

 ⑥ RAMEN Push
    → Fireball (decision) → API Gateway (payload)
    → RAMEN Server → gRPC Stream → Driver Phone
    → "You have a new ride request!"

 ⑦ Driver accepts
    → Trip Service → Kafka "ride.status.changes"
    → RAMEN Push → Rider App: "Your driver is on the way!"
    → Location stream starts pushing driver's position to the Rider App
    → Car moves smoothly on the map 🚗

 Total elapsed time: &amp;lt; 2 seconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Official Reference Sources
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Content&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://eng.uber.com/h3/" rel="noopener noreferrer"&gt;Uber Eng: H3 Hexagonal Index&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Hexagonal gridding algorithm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://eng.uber.com/real-time-push-platform/" rel="noopener noreferrer"&gt;Uber Eng: RAMEN&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Push messaging architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://eng.uber.com/disco/" rel="noopener noreferrer"&gt;Uber Eng: DISCO&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Matching Engine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://eng.uber.com/deepeta/" rel="noopener noreferrer"&gt;Uber Eng: DeepETA&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;ML model for ETA prediction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://engineering.grab.com/" rel="noopener noreferrer"&gt;Grab Eng: Fulfilment Platform&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Dispatch platform architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://engineering.grab.com/" rel="noopener noreferrer"&gt;Grab Eng: DispatchGym&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;RL framework for dispatch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://eng.lyft.com/" rel="noopener noreferrer"&gt;Lyft Eng: Real-time Map Matching&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Kalman Filters &amp;amp; Map Matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://h3geo.org/" rel="noopener noreferrer"&gt;H3 Documentation&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;API reference for H3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://s2geometry.io/" rel="noopener noreferrer"&gt;Google S2 Geometry&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;API reference for S2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Congratulations on completing the series! You now deeply understand every architectural layer behind that smoothly moving car on your map. From the GPS sensor → Kalman Filter → Kafka → H3 → DISCO → RAMEN → Your App. **Every layer represents a fascinating distributed engineering challenge.&lt;/em&gt;**&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/ride-hailing-realtime-architecture/part-6-realtime-push-ramen/" rel="noopener noreferrer"&gt;Uber RAMEN: How Ride-Hailing Apps Push Real-Time Notifications to Millions of Devices&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>websockets</category>
      <category>go</category>
    </item>
    <item>
      <title>[System Design] Surge Pricing Algorithm: How Ride-Hailing Engines Calculate Surge Rate in Real Time</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Tue, 16 Jun 2026 12:02:51 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-surge-pricing-algorithm-how-ride-hailing-engines-calculate-surge-rate-in-real-time-li8</link>
      <guid>https://dev.to/vesviet/system-design-surge-pricing-algorithm-how-ride-hailing-engines-calculate-surge-rate-in-real-time-li8</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series context:&lt;/strong&gt; This is Part 5 of the &lt;a href="https://dev.to/series/ride-hailing-realtime-architecture/"&gt;Real-Time Ride-Hailing Architecture&lt;/a&gt; series. For location ingestion and geospatial indexing, start at &lt;a href="https://dev.to/series/ride-hailing-realtime-architecture/part-1-location-ingestion/"&gt;Part 1&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What Is Surge Rate? (And How Is It Calculated?)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Surge rate&lt;/strong&gt; is the real-time price multiplier (e.g., 2.0×) applied by ride-hailing platforms when ride demand in a geographic zone exceeds available driver supply. It is recalculated every 30–60 seconds per H3 hexagon cell using a demand/supply ratio fed into a lookup table or ML model.&lt;/p&gt;

&lt;p&gt;{{&amp;lt; faq q="What is surge rate?" &amp;gt;}}&lt;br&gt;
Surge rate (also called surge pricing or surge multiplier) is the real-time price multiplier that ride-hailing platforms like Uber and Grab apply when demand for rides exceeds the available supply of drivers in a geographic zone. A surge rate of 2.0x means the rider pays twice the base fare.&lt;br&gt;
{{&amp;lt; /faq &amp;gt;}}&lt;/p&gt;

&lt;p&gt;{{&amp;lt; faq q="How is surge rate calculated?" &amp;gt;}}&lt;br&gt;
The surge rate is calculated by a pricing engine that evaluates the ratio of incoming ride requests (demand) versus available drivers (supply) in a specific H3 hexagon cell over a rolling time window (typically 5 minutes). The ratio is fed into a lookup table or ML model that outputs the surge multiplier.&lt;br&gt;
{{&amp;lt; /faq &amp;gt;}}&lt;/p&gt;
&lt;h2&gt;
  
  
  Why is Surge Pricing Necessary?
&lt;/h2&gt;

&lt;p&gt;On New Year's Eve, during heavy rain, or at rush hour — the demand for rides skyrockets, but the number of available drivers remains unchanged. If prices were kept fixed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Riders wouldn't be able to book a ride because there are no available drivers.&lt;/li&gt;
&lt;li&gt;Drivers in other areas would have no incentive to move to the hot zones.&lt;/li&gt;
&lt;li&gt;The system would be overwhelmed, leading to massive wait times.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Surge Pricing&lt;/strong&gt; (or Dynamic Pricing) is not merely a tool to increase revenue — it is a &lt;strong&gt;marketplace equilibrium mechanism&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Price increases → Two simultaneous effects:

1. SUPPLY INCREASES: Drivers see red zones (high prices) on their heatmap
                     → They move toward those areas to earn more
                     → The number of available drivers in the area increases

2. DEMAND DECREASES: Riders see high prices → Some choose to wait, take a bus,
                     or walk → The number of ride requests drops

→ Supply and demand gradually return to EQUILIBRIUM
→ Wait times are reduced for riders who truly need a car
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Surge Pricing Engine Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────────────────────────────┐
│                      DATA PIPELINE                              │
│                                                                  │
│  Kafka Topic              Flink Stream Processing                │
│  "driver.location"  ───►  ┌────────────────────┐                │
│  "ride.requests"    ───►  │  Supply-Demand      │                │
│                           │  Aggregator         │                │
│                           │  (per H3 cell,      │                │
│                           │   5-min window)     │                │
│                           └─────────┬──────────┘                │
│                                     │                            │
│                                     ▼                            │
│                           ┌────────────────────┐                │
│                           │  Pricing Engine     │                │
│                           │  (Surge Calculator) │                │
│                           └─────────┬──────────┘                │
│                                     │                            │
│                           ┌─────────▼──────────┐                │
│                           │  Redis Cache        │                │
│                           │  (Surge Multipliers) │                │
│                           └─────────┬──────────┘                │
│                                     │                            │
│                    ┌────────────────┼────────────────┐           │
│                    ▼                ▼                ▼           │
│              Rider App        Driver App       Matching Engine   │
│              (Shows price)    (Heatmap)        (Weighs cost)    │
└────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 1: Geofencing with H3
&lt;/h2&gt;

&lt;p&gt;Surge pricing is not calculated for an entire city — it is calculated for &lt;strong&gt;individual H3 hexagons&lt;/strong&gt;. Uber uses Resolution 7 (each cell ~5 km²), which is large enough to be statistically significant but small enough to reflect hyper-local conditions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Ho Chi Minh City is divided into ~200 H3 cells (Resolution 7)

Cell A (District 1 - Center): Supply=5,  Demand=30  → Surge 3.2x
Cell B (District 7 - Suburb): Supply=20, Demand=15  → Surge 1.0x (normal)
Cell C (Airport):             Supply=8,  Demand=40  → Surge 4.0x
Cell D (District 9 - Outskirts): Supply=12, Demand=3   → Surge 1.0x
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 2: Calculating the Surge Multiplier
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Basic Model: Supply-Demand Ratio and Allocation Algorithms
&lt;/h3&gt;

&lt;p&gt;Just like in e-commerce &lt;a href="https://dev.to/series/ecommerce-order-allocation/part-3-allocation-algorithms/"&gt;allocation algorithms&lt;/a&gt; that decide which warehouse fulfills an order, the surge engine evaluates resources dynamically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;surge_multiplier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;demand&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;supply&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;Where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="n"&gt;supply&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Number&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;AVAILABLE&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;H3&lt;/span&gt; &lt;span class="nf"&gt;cell &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;demand&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Number&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;ride&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;H3&lt;/span&gt; &lt;span class="nf"&gt;cell &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;mins&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;Example&lt;/span&gt; &lt;span class="n"&gt;simple&lt;/span&gt; &lt;span class="nf"&gt;formula &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;illustrative&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="n"&gt;ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;demand&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;supply&lt;/span&gt;

  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;surge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normal&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;surge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.5&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;surge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;surge&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;3.5&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maximum&lt;/span&gt; &lt;span class="n"&gt;cap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Advanced Model: Machine Learning
&lt;/h3&gt;

&lt;p&gt;In reality, Uber doesn't use a simple linear formula. They use &lt;strong&gt;ML models&lt;/strong&gt; to calculate optimal prices based on a multitude of factors:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Input Feature&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Supply count&lt;/td&gt;
&lt;td&gt;Number of idle drivers in the cell&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Demand count&lt;/td&gt;
&lt;td&gt;Number of requests in a sliding window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Historical patterns&lt;/td&gt;
&lt;td&gt;Supply-demand patterns by hour/day of the week&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Weather data&lt;/td&gt;
&lt;td&gt;Raining → demand rises, supply drops&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Events&lt;/td&gt;
&lt;td&gt;Large events (concerts, football games)?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conversion rate&lt;/td&gt;
&lt;td&gt;What % of riders still book at the current surge price?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Neighboring cells&lt;/td&gt;
&lt;td&gt;Surge levels in adjacent cells (spillover effect)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Feedback Loop — Preventing Over-Pricing
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Continuous feedback loop:

1. Surge = 3.0x → Many riders cancel (conversion rate drops from 70% → 30%)
2. Engine realizes: price is too high, riders are abandoning
3. Lowers surge to 2.0x → Conversion rate recovers to 60%
4. Simultaneously, drivers arrive (supply increases) → ratio drops
5. Surge continues dropping to 1.5x → 1.0x

This entire process happens automatically over a few minutes.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 3: The Driver Heatmap
&lt;/h2&gt;

&lt;p&gt;Surge pricing doesn't just affect the cost for the rider — it generates a &lt;strong&gt;Heatmap&lt;/strong&gt; displayed on the driver's app, guiding them to areas with high demand.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Heatmap Visualization:

  ┌────────────────────────────────────┐
  │                                    │
  │      🟢          🟡               │
  │            🟢         🟡          │
  │      🟢   District 7     🔴       │
  │            🟢    🟡  🔴 District 1│
  │      🟢        🟡  🔴 🔴         │
  │                   🔴              │
  │               🟡                  │
  │                                    │
  └────────────────────────────────────┘

  🟢 = 1.0x (normal, surplus of drivers)
  🟡 = 1.5-2.0x (moderate demand)
  🔴 = 2.5x+ (very high demand, great earning potential)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Real-time Heatmap Updates
&lt;/h3&gt;

&lt;p&gt;The heatmap is pushed to the driver app via &lt;strong&gt;WebSockets&lt;/strong&gt; (or gRPC streams):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;Server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;WebSocket&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Push&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Driver&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;App&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;Payload&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;every&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seconds:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"heatmap"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"h3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"872a100d6ffffff"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"surge"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"color"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"#FF0000"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"h3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"872a100d7ffffff"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"surge"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"color"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"#00FF00"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"h3"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"872a100d8ffffff"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"surge"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"color"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"#FFAA00"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"updated_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-05-06T20:30:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Predictive Surge — Anticipating Demand
&lt;/h2&gt;

&lt;p&gt;Uber and Grab don't just react to current surges — they &lt;strong&gt;predict surges&lt;/strong&gt; before they happen:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Predictive Model:

Inputs:
  - Current time: 17:00 (rush hour approaching)
  - Day: Friday (weekend → demand rises)
  - Weather: Rain forecasted at 17:30
  - Events: Concert at the Stadium at 20:00
  - History: The last 4 Fridays also surged to 2.5x at 17:30

Output:
  - Prediction: Surge will hit 2.8x in the Stadium area at 17:30
  - Action: Send notifications to nearby drivers 15 minutes BEFORE
    "High demand expected near the Stadium soon, drive there to earn more!"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Upfront Pricing vs. Surge Multiplier
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Old Model: Surge Multiplier (Uber before 2017)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Price shown to rider: "Surge 2.5x"
Final Price = Base Fare × 2.5
Problem: Riders didn't know the total cost before getting in → Surprises, complaints
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The New Model: Upfront Pricing (Current Uber, Grab)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Shown to rider: "Price: 125,000 VND" (fixed before booking)

Price is calculated from:
  base_fare + (distance × per_km_rate) + (time × per_min_rate)
  + surge_premium
  + route_specific_adjustments (e.g., predicted traffic jams)

The rider knows the exact price upfront → Much more transparent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Storing the Surge State
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Redis: Stores the surge multiplier for each H3 cell
-- Key pattern: surge:{resolution}:{h3_cell_id}
-- TTL: 60 seconds (auto-expires if not updated → falls back to 1.0x)

SET surge:7:872a100d6ffffff "3.2" EX 60
SET surge:7:872a100d7ffffff "1.0" EX 60
SET surge:7:872a100d8ffffff "1.8" EX 60

-- When Rider App requests a price:
GET surge:7:872a100d6ffffff → "3.2"
-- The API Gateway uses this value to calculate the Upfront Price
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Anti-Abuse Mechanisms
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Risk&lt;/th&gt;
&lt;th&gt;Solution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Drivers deliberately turning off apps to create artificial scarcity&lt;/td&gt;
&lt;td&gt;Detect patterns: many drivers going offline simultaneously → flag&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Drivers only accepting high surge rides, rejecting normal rides&lt;/td&gt;
&lt;td&gt;Low acceptance rate → lower priority in matching algorithm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extremely high surges causing massive backlash&lt;/td&gt;
&lt;td&gt;Maximum cap (e.g., 5.0x), soft caps based on conversion rates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;"Flickering" surge (rapidly fluctuating prices)&lt;/td&gt;
&lt;td&gt;Smoothing: surge can only increase/decrease by a max of 0.5x every 30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Surge Rate FAQ: Common Questions Answered
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is surge rate in ride-hailing?&lt;/strong&gt;&lt;br&gt;
Surge rate is the multiplier applied to a ride's base price during peak demand periods. A surge rate of 1.5x on a $10 base fare means the rider pays $15. The surge rate is calculated per geographic zone (H3 cell) and updated every 30–60 seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why does surge rate exist?&lt;/strong&gt;&lt;br&gt;
Surge rate is a market-clearing mechanism, not just a revenue tool. When demand outpaces supply, a higher surge rate simultaneously attracts more drivers to the area (supply increase) and filters out lower-urgency ride requests (demand decrease), restoring equilibrium and reducing wait times for riders who genuinely need a car.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is a normal surge rate vs. a high surge rate?&lt;/strong&gt;&lt;br&gt;
A surge rate of 1.0x is the baseline (no surge). Rates between 1.2x–1.8x are considered moderate surge. Rates above 2.0x indicate heavy demand imbalance. Most platforms enforce a hard cap (e.g., 5.0x) to prevent extreme price spikes that damage user trust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How long does a surge rate last?&lt;/strong&gt;&lt;br&gt;
Surge rates are recalculated every 30–60 seconds using a sliding window over the last 5 minutes of supply-demand data. In most cases, a surge event lasts 15–45 minutes before drivers repositioning to the zone restore equilibrium.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does the surge pricing engine work technically?&lt;/strong&gt;&lt;br&gt;
The engine ingests driver location events and ride request events from a message broker (Kafka). A stream processor (Apache Flink) aggregates supply and demand counts per H3 cell on a 5-minute tumbling window. The output is a demand/supply ratio that maps to a surge multiplier via a lookup table or an ML model. The resulting multiplier is cached in Redis with a 60-second TTL and read by the API gateway at price-calculation time.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;In the final part, we will explore RAMEN — Uber's real-time communication infrastructure, which solves the problem of pushing instant notifications to millions of devices simultaneously. Continue reading &lt;a href="https://dev.to/series/ride-hailing-realtime-architecture/part-6-realtime-push-ramen/"&gt;Part 6 — RAMEN &amp;amp; Real-time Communication&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/ride-hailing-realtime-architecture/part-5-pricing-surge-engine/" rel="noopener noreferrer"&gt;Surge Pricing Algorithm: How Ride-Hailing Engines Calculate Surge Rate in Real Time&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>algorithms</category>
      <category>realtimemicroservices</category>
    </item>
    <item>
      <title>[System Design] Ride-Hailing Dispatch Algorithm: How Uber DISCO &amp; Grab DispatchGym Match Drivers</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Mon, 15 Jun 2026 00:01:48 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-ride-hailing-dispatch-algorithm-how-uber-disco-grab-dispatchgym-match-drivers-pp4</link>
      <guid>https://dev.to/vesviet/system-design-ride-hailing-dispatch-algorithm-how-uber-disco-grab-dispatchgym-match-drivers-pp4</guid>
      <description>&lt;p&gt;Every time you tap "Book Ride," a system makes dozens of decisions in under two seconds: Which driver? What route? What's the real ETA? This article breaks down exactly how the &lt;strong&gt;dispatch algorithm&lt;/strong&gt; works — from the greedy approach that fails at scale, to the bipartite graphs, batched matching, and &lt;a href="https://dev.to/series/ride-hailing-realtime-architecture/part-5-pricing-surge-engine/"&gt;surge pricing&lt;/a&gt; mechanics that power Uber, Lyft, Grab, and Gojek today.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why a Greedy Dispatch Algorithm Fails (Closest Driver Problem)
&lt;/h2&gt;

&lt;p&gt;The first instinct when designing a matching system is to pair every customer with their nearest driver. However, this &lt;strong&gt;Greedy&lt;/strong&gt; approach causes massive losses at a system-wide scale:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Example: 3 riders (R1, R2, R3) and 3 drivers (D1, D2, D3)

Greedy Matching (closest driver):
  R1 ← D1 (ETA 2 mins)  ✓
  R2 ← D3 (ETA 8 mins)  ← D2 was "taken" by R1, even though D2 is closer to R2
  R3 ← D2 (ETA 10 mins) ← Terrible outcome

  Total ETA: 2 + 8 + 10 = 20 minutes

Optimal Matching (global optimal):
  R1 ← D2 (ETA 3 mins)
  R2 ← D1 (ETA 3 mins)
  R3 ← D3 (ETA 4 mins)

  Total ETA: 3 + 3 + 4 = 10 minutes  ← 50% better!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Uber refers to this problem as &lt;strong&gt;Global Optimization&lt;/strong&gt; — finding an assignment strategy that minimizes the &lt;strong&gt;total ETA of the entire system&lt;/strong&gt;, rather than optimizing just for individual pairs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Bipartite Graph Matching: The Mathematical Foundation (Lyft)
&lt;/h2&gt;

&lt;p&gt;Before diving into the systems, it helps to understand the &lt;strong&gt;mathematical model&lt;/strong&gt; that all ride-hailing matching engines share at their core.&lt;/p&gt;

&lt;p&gt;Lyft formalizes dispatch as a &lt;strong&gt;bipartite graph matching problem&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Bipartite Graph:
  Set A (Riders):  { R1, R2, R3, R4 }
  Set B (Drivers): { D1, D2, D3, D4, D5 }

  Edges: every possible Rider ↔ Driver pair
  Edge Weight: cost of that match (e.g., ETA, driver rating, distance)

  Goal: Find a set of edges (a "matching") where:
    - No rider is matched to more than one driver
    - No driver is matched to more than one rider
    - The total cost of all selected edges is minimized
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is known as the &lt;strong&gt;Minimum Weight Bipartite Matching&lt;/strong&gt; problem. The classical algorithm for solving it is the &lt;strong&gt;Hungarian Algorithm&lt;/strong&gt; (also called the Kuhn-Munkres algorithm), which runs in &lt;strong&gt;O(n³)&lt;/strong&gt; time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Batching Matters for Bipartite Matching
&lt;/h3&gt;

&lt;p&gt;The key insight is that you can only find a globally optimal bipartite matching if you have &lt;strong&gt;multiple riders and drivers available simultaneously&lt;/strong&gt;. If you match greedily (one-by-one as requests arrive), you lose the ability to find the global optimum.&lt;/p&gt;

&lt;p&gt;This is why all major ride-hailing platforms introduce a &lt;strong&gt;batching window&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Batching Strategy:
  1. Collect all ride requests in a 2-5 second window
  2. Build a complete Rider × Driver cost matrix
  3. Run the Hungarian Algorithm on the full batch
  4. Dispatch all assignments simultaneously

Result: System-wide optimal — not just locally optimal for each individual request
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Global Optimality&lt;/th&gt;
&lt;th&gt;Scalability&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pure Greedy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Near-zero&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Batched Matching&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2-5 seconds&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RL-Adaptive Batching&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Dynamic&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Uber DISCO: The Core Dispatch Algorithm Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;DISCO&lt;/strong&gt; (Dispatch Optimization) is Uber's matching system, responsible for pairing millions of ride requests with drivers every day.&lt;/p&gt;

&lt;h3&gt;
  
  
  Overall Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────┐     ┌─────────────────┐
│  Rider App      │     │  Driver App      │
│  "Book Ride"    │     │  GPS, Status     │
└────────┬────────┘     └────────┬────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐     ┌─────────────────┐
│  Demand Service │     │  Supply Service  │
│  (Ride Requests)│     │  (Driver Pool)   │
└────────┬────────┘     └────────┬────────┘
         │                       │
         └──────────┬────────────┘
                    ▼
         ┌─────────────────────┐
         │    DISCO Engine      │
         │                     │
         │  1. Candidate Filter │ ← S2/H3 Geospatial Query
         │  2. ETA Calculator   │ ← Routing Service + DeepETA
         │  3. Batch Optimizer  │ ← Hungarian Algorithm
         │  4. Dispatch         │ ← RAMEN Push
         └─────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 1: Candidate Filtering
&lt;/h3&gt;

&lt;p&gt;When a ride request arrives, DISCO doesn't check every driver. It uses the rider's &lt;strong&gt;&lt;a href="https://dev.to/series/ride-hailing-realtime-architecture/part-2-geospatial-indexing/"&gt;S2 Cell ID&lt;/a&gt;&lt;/strong&gt; (or H3) to rapidly narrow down the list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input:  Rider location → S2 Cell Level 12
Action: Find all neighboring S2 cells (covering circle radius ~3km)
Query:  Redis/Memory → Retrieve list of drivers in those cells

Filters:
  ✓ Status = AVAILABLE (not currently carrying a passenger)
  ✓ Vehicle type matches (Car, Bike, SUV, etc.)
  ✓ Capacity is sufficient (if a group ride)
  ✓ Not blocked by the rider

Output: ~10-30 candidate drivers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  S2 vs H3: Choosing the Right Spatial Index
&lt;/h3&gt;

&lt;p&gt;Uber originally used &lt;strong&gt;S2 geometry&lt;/strong&gt; (quad-tree squares) for spatial sharding but later developed &lt;strong&gt;H3&lt;/strong&gt; (hexagons). The difference matters for how the engine finds candidate drivers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;S2 Geometry (Google)&lt;/th&gt;
&lt;th&gt;H3 (Uber)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cell shape&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Square&lt;/td&gt;
&lt;td&gt;Hexagon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Neighbor equidistance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No (diagonal cells are further)&lt;/td&gt;
&lt;td&gt;Yes (all 6 neighbors equidistant)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Precise geofencing, hierarchical sharding&lt;/td&gt;
&lt;td&gt;Proximity search, surge heatmaps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Used by&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lyft, early Uber DISCO&lt;/td&gt;
&lt;td&gt;Modern Uber, Grab&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Step 2: ETA Calculation with DeepETA
&lt;/h3&gt;

&lt;p&gt;Straight-line (crow-fly) distance is meaningless in the real world — 500m straight-line might be a 3km drive (due to bridges, intersections, or one-way streets).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Candidate drivers: [D1, D2, D3, D4, D5]
Rider position: R1

Routing Service calculates actual ETA based on:
  - Digitized road network graph
  - Real-time traffic data
  - One-way streets, overpasses, tunnels
  - Rush hours / historical traffic patterns

Results:
  D1: 800m crow-fly → 2.3km road → ETA 4 mins (heavy traffic)
  D2: 1.2km crow-fly → 1.5km road → ETA 3 mins (clear roads)  ← Better!
  D3: 600m crow-fly → 3.8km road → ETA 7 mins (must U-turn)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Uber developed &lt;strong&gt;DeepETA&lt;/strong&gt; — a hybrid deep learning approach that sits &lt;strong&gt;on top of&lt;/strong&gt; the traditional routing engine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DeepETA Architecture:
  1. Traditional Router → "Naive ETA" (e.g., 4 mins based on road graph)
  2. Deep Neural Network → "Residual Prediction" (e.g., +1.5 min due to
     traffic signal, weather, specific intersection complexity)
  3. Final ETA = Naive ETA + Residual

Input features to the DNN:
  - GPS coordinates (cleaned by Kalman Filter)
  - Time of day, day of week
  - Historical traffic at this location
  - Driver/vehicle attributes
  - Trip type (airport, downtown, etc.)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Kalman Filter role:&lt;/strong&gt; Raw GPS signals are noisy in urban environments (tall buildings, tunnels). A Kalman Filter smooths the GPS stream before it feeds into DeepETA, ensuring the model learns from accurate positional data rather than jittery raw coordinates.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 3: Batched Matching
&lt;/h3&gt;

&lt;p&gt;This is the most crucial and complex step. Instead of processing each request individually, DISCO &lt;strong&gt;batches requests&lt;/strong&gt; within a short time window (a few seconds) and solves the optimal assignment problem for the entire batch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Batching Window: 2 seconds

Within 2 seconds, the system receives:
  Ride Requests: [R1, R2, R3, R4, R5]
  Available Drivers: [D1, D2, D3, D4, D5, D6, D7]

Build Cost Matrix (ETA between every rider-driver pair):

         D1   D2   D3   D4   D5   D6   D7
  R1  [  3    5    8    2    7    9    4  ]
  R2  [  6    2    4    7    3    5    8  ]
  R3  [  4    7    3    5    6    2    9  ]
  R4  [  8    4    6    3    5    7    2  ]
  R5  [  5    6    2    8    4    3    7  ]

The Problem: Find a 1-to-1 assignment such that the total cost is minimized.
→ Hungarian Algorithm (or Auction Algorithm)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Hungarian Algorithm
&lt;/h3&gt;

&lt;p&gt;This is a classic algorithm used to solve the &lt;strong&gt;Assignment Problem&lt;/strong&gt; with a time complexity of &lt;strong&gt;O(n³)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input: N×M Cost Matrix (N riders, M drivers, M ≥ N)
Output: Optimal 1-to-1 assignment

Optimal Results:
  R1 ← D4 (ETA 2 mins)
  R2 ← D2 (ETA 2 mins)
  R3 ← D6 (ETA 2 mins)
  R4 ← D7 (ETA 2 mins)
  R5 ← D3 (ETA 2 mins)

  Total ETA: 10 minutes (compared to a Greedy approach which might yield 20+ minutes)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Ringpop — Distributed Coordination
&lt;/h2&gt;

&lt;p&gt;DISCO runs on multiple servers. How do we ensure two different DISCO servers don't attempt to assign the exact same driver to two different riders?&lt;/p&gt;

&lt;p&gt;Uber developed &lt;strong&gt;Ringpop&lt;/strong&gt; — a consistent hashing library based on the &lt;strong&gt;SWIM&lt;/strong&gt; (gossip) protocol:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Ringpop Hash Ring:

Server A ────── Server B ────── Server C ────── Server A
   │                │                │
   ▼                ▼                ▼
  Riders &amp;amp;       Riders &amp;amp;         Riders &amp;amp;
  Drivers        Drivers          Drivers
  in Zone 1      in Zone 2        in Zone 3

Each S2 Cell is hashed to a specific DISCO server.
→ All requests and drivers within the same area are processed by the SAME DISCO server.
→ No conflict during assignment.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;SWIM Protocol:&lt;/strong&gt; Every DISCO node periodically "pings" other nodes to check if they are alive. If a node goes down, the ring auto-rebalances — the S2 cells of the dead node are transferred to neighboring nodes in the ring.&lt;/p&gt;




&lt;h2&gt;
  
  
  Fault Tolerance: State Digest
&lt;/h2&gt;

&lt;p&gt;The problem: If an Uber data center goes down abruptly, will the states of all in-flight trips be lost?&lt;/p&gt;

&lt;p&gt;Uber's clever solution: &lt;strong&gt;Encrypted State Digest&lt;/strong&gt; — DISCO periodically pushes the trip state (encrypted) down to be stored directly on &lt;strong&gt;the driver's phone&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Normal Flow:
  DISCO Server ◄──── state ────► Driver Phone
  (Source of truth)               (Backup copy)

When a data center crashes:
  1. A new DISCO Server boots up
  2. Driver phones reconnect
  3. DISCO requests driver phones to send back the state digest
  4. DISCO decrypts and restores the state of all in-flight trips
  5. The system continues operating without dropping a single ride
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  From Heuristics to Machine Learning: Gojek Jaeger
&lt;/h2&gt;

&lt;p&gt;Gojek's evolution from a simple dispatch heuristic to a production ML system is a masterclass in how marketplace complexity forces you to rethink single-objective optimization.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem with a Single ML Model
&lt;/h3&gt;

&lt;p&gt;Gojek's initial approach used a single machine learning model to rank and select drivers. It worked — until it didn't. The model optimized aggressively for its primary metric (say, acceptance rate) and created &lt;strong&gt;feedback loops&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Feedback Loop Problem:
  1. Model learns: "Drivers in Zone A accept 90% of the time"
  2. Model always sends orders to Zone A drivers
  3. Zone A drivers get overwhelmed, acceptance rate drops
  4. Zone B drivers are idle, marketplace liquidity drops
  5. Riders in Zone B wait longer → lower satisfaction
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single-objective model cannot balance competing goals without explicit guardrails.&lt;/p&gt;

&lt;h3&gt;
  
  
  Jaeger: Multi-Objective Allocation
&lt;/h3&gt;

&lt;p&gt;Gojek built &lt;strong&gt;Jaeger&lt;/strong&gt; — a multi-objective allocation framework that simultaneously optimizes for:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Jaeger Optimization Objectives:
  ↓ Minimize pickup ETA          (rider experience)
  ↑ Maximize driver utilization  (driver earnings)
  ↑ Maximize acceptance rate     (marketplace flow)
  ↑ Ensure fairness              (prevent driver starvation)

Architecture:
  [Real-time Features]  ←── GPS, traffic, demand heatmaps
        │
        ▼
  [ML Models]           ←── Acceptance probability, ETA model
        │
        ▼
  [Manual Configs]      ←── Business rules, fairness floors
        │
        ▼
  [Jaeger Aggregator]   ←── Weights &amp;amp; combines all signals
        │
        ▼
  [Driver Score]        ←── Final ranking for dispatch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;Manual Configs layer&lt;/strong&gt; is intentional: it gives business teams control to override ML decisions in edge cases (market launches, weather events, regulatory requirements) without retraining the entire model.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Future: Reinforcement Learning &amp;amp; MDP in Dispatching (Grab DispatchGym)
&lt;/h2&gt;

&lt;p&gt;The Hungarian algorithm solves the immediate assignment optimally. But it has a fundamental limitation: &lt;strong&gt;it only optimizes for the current batch&lt;/strong&gt;. It cannot answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Should I assign a driver now, or wait 30 seconds for a better match?&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;If I send this driver 5km across town, will that leave a coverage gap for the next surge?&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where &lt;strong&gt;Reinforcement Learning (RL)&lt;/strong&gt; and the &lt;strong&gt;Markov Decision Process (MDP)&lt;/strong&gt; formulation come in.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dispatch as a Markov Decision Process
&lt;/h3&gt;

&lt;p&gt;An MDP models dispatch as a sequence of decisions where &lt;strong&gt;each action affects future states&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MDP Formulation for Dispatch:

State (S):
  - Current positions of all idle drivers
  - Pending ride requests and their locations
  - Time of day, predicted demand heatmap
  - Traffic conditions across all zones

Action (A):
  - Assign Driver D to Rider R
  - Hold Driver D idle (wait for a better request)
  - Reposition Driver D to Zone X (proactive repositioning)

Transition (T):
  - Probability that action A in state S leads to state S'
  - e.g., P(driver ends up in Zone B | assigned to trip starting in Zone A)

Reward (R):
  - Positive: completed trip, short pickup ETA, high acceptance
  - Negative: long idle time, deadhead mileage, rider cancellation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key difference from greedy matching: the RL agent learns to make decisions that &lt;strong&gt;maximize cumulative long-term reward&lt;/strong&gt; — not just the immediate reward for a single trip.&lt;/p&gt;

&lt;h3&gt;
  
  
  Grab DispatchGym
&lt;/h3&gt;

&lt;p&gt;Grab built &lt;strong&gt;DispatchGym&lt;/strong&gt; to make RL research accessible for dispatching problems. Its design solves a core challenge: training RL agents in production is dangerous (bad policies lose money and drivers). DispatchGym provides a &lt;strong&gt;safe, simulated environment&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DispatchGym Architecture:

┌───────────────────────────────────┐
│         Simulation Layer           │
│  - Replays historical trip data    │
│  - Injects synthetic demand spikes │
│  - Models driver behavior          │
└────────────────┬──────────────────┘
                 │  State observation
                 ▼
┌───────────────────────────────────┐
│         RL Agent (Policy)          │
│  - Gymnasium API compatible        │
│  - Trainable with any RL algo      │
│    (PPO, SAC, DQN...)              │
└────────────────┬──────────────────┘
                 │  Action (dispatch decision)
                 ▼
┌───────────────────────────────────┐
│        Reward Computation          │
│  - Total completed trips           │
│  - Average pickup ETA              │
│  - Driver earnings equity          │
│  - Acceptance rate                 │
└───────────────────────────────────┘

Deployment:
  Reward stabilizes → A/B test in production → Gradual rollout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Multi-Agent RL: When One Agent Isn't Enough
&lt;/h3&gt;

&lt;p&gt;A single RL agent controlling an entire city's fleet hits the &lt;strong&gt;curse of dimensionality&lt;/strong&gt; — the state-action space is too large. The solution is &lt;strong&gt;Multi-Agent Reinforcement Learning (MARL)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MARL for Dispatch:
  - Each geographic zone (or each driver) is an independent agent
  - Agents observe their local state (nearby riders, driver density)
  - Agents take local actions (match, hold, reposition)
  - Coordination mechanism ensures global objectives are met

Paradigm: Centralized Training, Decentralized Execution (CTDE)
  - During training: agents share global information to learn cooperation
  - During execution: each agent acts on local observations only
  - Result: Scales to city-wide fleets without centralized bottleneck
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Academic research on MARL-based dispatch suggests wait time reductions of &lt;strong&gt;25–40% compared to greedy baselines&lt;/strong&gt;, with significant improvements in driver idle mileage — though real-world results vary by city density and fleet size.&lt;/p&gt;




&lt;h2&gt;
  
  
  Grab's Fulfilment Platform Architecture
&lt;/h2&gt;

&lt;p&gt;Grab doesn't just run ride-hailing — it runs food, groceries, express delivery, and financial services on the same driver network. Early on, each vertical had its own dispatch engine, causing massive inefficiency.&lt;/p&gt;

&lt;p&gt;Grab solved this with the &lt;strong&gt;Fulfilment Platform&lt;/strong&gt; — a unified three-layer architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────┐
│         Business Verticals             │
│  GrabCar | GrabFood | GrabMart | ...   │
└───────────────┬────────────────────────┘
                │ Demand signals
                ▼
┌────────────────────────────────────────┐
│         Fulfilment Platform            │
│  - Unified dispatch engine             │
│  - Supply shaping &amp;amp; driver incentives  │
│  - Global optimization across verticals│
└───────────────┬────────────────────────┘
                │ Infrastructure
                ▼
┌────────────────────────────────────────┐
│         Technology Infrastructure      │
│  - DynamoDB (OLTP: live orders)        │
│  - MySQL partitioned (OLAP: analytics) │
│  - 1,000+ microservices on AWS/GCP     │
└────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The critical innovation: a driver finishing a GrabFood delivery can be &lt;strong&gt;immediately available for a GrabCar ride&lt;/strong&gt; — the same Fulfilment Platform sees the full picture and can optimize across verticals. This eliminated "dead time" between trips and significantly improved driver earnings.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Metrics for the Matching Engine
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Target&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;P50/P99 Matching Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Time from request received to dispatch&lt;/td&gt;
&lt;td&gt;&amp;lt; 2 seconds (P99)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Acceptance Rate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;% of drivers accepting when offered&lt;/td&gt;
&lt;td&gt;&amp;gt; 85%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ETA Accuracy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Error margin between predicted and actual ETA&lt;/td&gt;
&lt;td&gt;&amp;lt; 20%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Match Rate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;% of requests successfully matched&lt;/td&gt;
&lt;td&gt;&amp;gt; 95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total Wait Time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Total time a rider waits (rider + driver travel time)&lt;/td&gt;
&lt;td&gt;Minimize&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Driver Idle Mileage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Distance driven without a passenger&lt;/td&gt;
&lt;td&gt;Minimize&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Marketplace Liquidity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Balance of supply/demand across zones&lt;/td&gt;
&lt;td&gt;Maintain&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  FAQ: Dispatch Algorithms
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is a dispatch algorithm?&lt;/strong&gt;&lt;br&gt;
A dispatch algorithm is a system used by ride-hailing platforms like Uber and Grab to optimally match riders with available drivers. It goes beyond finding the closest driver by using batched matching and global optimization to minimize the total wait time (ETA) for all users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does Uber match riders and drivers?&lt;/strong&gt;&lt;br&gt;
Uber uses a system called DISCO that batches ride requests every 2-5 seconds and applies the Hungarian algorithm for global optimization across the entire batch. This ensures system-wide efficiency, not just locally optimal matches for each individual request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why is reinforcement learning used in ride-hailing?&lt;/strong&gt;&lt;br&gt;
RL models treat dispatching as a Markov Decision Process (MDP), allowing the system to optimize for long-term marketplace liquidity and driver earnings, rather than just immediate wait times. Platforms like Grab use RL via DispatchGym to learn when to hold a driver for a better future match, rather than always assigning immediately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the difference between S2 and H3 in dispatch systems?&lt;/strong&gt;&lt;br&gt;
S2 (Google) uses square cells and is excellent for precise geofencing and hierarchical sharding. H3 (Uber) uses hexagonal cells, which offer uniform adjacency — all 6 neighbors are equidistant — making proximity searches and surge pricing heatmaps more accurate for real-time matching.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Next, we will look into Surge Pricing — the dynamic pricing system based on real-time supply and demand ratios. Continue reading &lt;a href="https://dev.to/series/ride-hailing-realtime-architecture/part-5-pricing-surge-engine/"&gt;Part 5 — Surge Pricing: Dynamic Pricing Based on Real-time Supply and Demand&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/ride-hailing-realtime-architecture/part-4-dispatch-matching-engine/" rel="noopener noreferrer"&gt;Ride-Hailing Dispatch Algorithm: How Uber DISCO &amp;amp; Grab DispatchGym Match Drivers&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>algorithms</category>
      <category>go</category>
    </item>
    <item>
      <title>[System Design] H3 Geospatial Indexing: How Uber Finds Nearby Drivers with Hexagonal Spatial Index</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Sun, 14 Jun 2026 02:06:33 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-h3-geospatial-indexing-how-uber-finds-nearby-drivers-with-hexagonal-spatial-index-2kpa</link>
      <guid>https://dev.to/vesviet/system-design-h3-geospatial-indexing-how-uber-finds-nearby-drivers-with-hexagonal-spatial-index-2kpa</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Finding a Needle in a Haystack
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Answer-first:&lt;/strong&gt; Uber and Grab find the nearest available driver in under 100ms by dividing the Earth's surface into hexagonal cells (H3 index at Resolution 8, each ~0.74 km²). Instead of calculating distance to every driver, they look up only the 7 cells nearest to the rider — reducing millions of comparisons to dozens.&lt;/p&gt;

&lt;p&gt;When you tap "Book" on Grab, the system must find the most suitable driver within a radius of a few kilometers. But the system is tracking millions of drivers simultaneously. The naive approach — calculating the distance from you to &lt;strong&gt;every&lt;/strong&gt; driver — is impossible:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;Naive&lt;/span&gt; &lt;span class="n"&gt;Approach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Brute&lt;/span&gt; &lt;span class="k"&gt;Force&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driver_location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rider_location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt; &lt;span class="c1"&gt;-- 2km&lt;/span&gt;
  &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driver_location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rider_location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;With&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;million&lt;/span&gt; &lt;span class="n"&gt;drivers&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;million&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="n"&gt;calculations&lt;/span&gt;
&lt;span class="n"&gt;Latency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;takes&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="n"&gt;Unacceptable&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The solution: &lt;strong&gt;Divide the map into a grid (Spatial Indexing)&lt;/strong&gt; to narrow down the search space from millions to just a few dozen.&lt;/p&gt;




&lt;h2&gt;
  
  
  Method 1: Geohash
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Geohash&lt;/strong&gt; encodes coordinates (latitude, longitude) into a character string. The longer the string, the smaller and more precise the grid square:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Coordinates: 10.7769, 106.7009 (District 1, HCMC)
Geohash: w3gvk1   (cell ~1.2km × 0.6km)
         w3gvk1e  (cell ~153m × 153m)
         w3gvk1e7 (cell ~38m × 19m)

Important characteristic: Prefix sharing
  w3gvk1e  ← Cells sharing a prefix are located near each other
  w3gvk1f
  w3gvk1g
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Advantages
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Simple and easy to understand.&lt;/li&gt;
&lt;li&gt;Database-friendly: you can use a LIKE query (&lt;code&gt;WHERE geohash LIKE 'w3gvk1%'&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Redis supports it natively (&lt;code&gt;GEOADD&lt;/code&gt;, &lt;code&gt;GEOSEARCH&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A Serious Disadvantage: The Edge Problem
&lt;/h3&gt;

&lt;p&gt;Geohash divides the map into squares. Two points right next to each other, but situated on the &lt;strong&gt;edges&lt;/strong&gt; of two completely different squares, will have entirely different prefixes — the system won't find the driver even if they are only 50 meters away.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────┬──────────┐
│          │          │
│  w3gvk1  │  w3gvk3  │  ← Two cells with completely different prefixes
│          │          │
│    ●──────────○     │  ← Very close but in different cells
│          │          │
└──────────┴──────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Workaround:&lt;/strong&gt; Always search the &lt;strong&gt;current cell + the 8 neighboring cells&lt;/strong&gt; (a 3x3 grid). But this creates many false positives in the corners.&lt;/p&gt;




&lt;h2&gt;
  
  
  Method 2: H3 — Uber's Hexagonal Grid
&lt;/h2&gt;

&lt;p&gt;Uber developed &lt;strong&gt;H3&lt;/strong&gt; (Hexagonal Hierarchical Spatial Index) to resolve all the limitations of Geohash. It is an open-source spatial indexing system that partitions the entire surface of the Earth into &lt;strong&gt;hexagons&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why are hexagons better than squares?
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Squares (Geohash):             Hexagons (H3):

┌────┬────┬────┐                 ╱╲    ╱╲
│    │    │    │                ╱    ╲╱    ╲
│  d │  ? │    │               │  d  ││  d  │
│    │    │    │                ╲    ╱╲    ╱
├────┼────┼────┤                 ╲╱    ╲╱
│  d │  ● │    │                 ╱╲  ● ╱╲
│    │    │    │                │  d  ││  d │
├────┼────┼────┤                ╲    ╱╲    ╱
│    │    │    │                 ╲╱    ╲╱
└────┴────┴────┘                 ╱╲    ╱╲
                                │  d  ││  d │
d = distance to ●              ╲    ╱╲    ╱
                                  ╲╱    ╲╱

Squares: 4 near neighbors (edges)   Hexagons: 6 neighbors
         4 far neighbors (corners)            ALL at the SAME distance!
         → Uneven distances                   → Uniform distance
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Equidistant:&lt;/strong&gt; Every neighboring cell of a hexagon is an equal distance from its center. With squares, corner cells are √2 times (~1.41 times) further away than edge cells. This unevenness creates biases in search algorithms.&lt;/p&gt;

&lt;h3&gt;
  
  
  16 Levels of Resolution
&lt;/h3&gt;

&lt;p&gt;H3 supports 16 levels of resolution, from level 0 (covering continents) to level 15 (a few square meters):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resolution&lt;/th&gt;
&lt;th&gt;Average Area&lt;/th&gt;
&lt;th&gt;Used For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;~4,357,449 km²&lt;/td&gt;
&lt;td&gt;Continental level&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;~1,770 km²&lt;/td&gt;
&lt;td&gt;City/Province level&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;~5.16 km²&lt;/td&gt;
&lt;td&gt;Surge Pricing zones&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;~0.74 km²&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Finding drivers (most common)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;~0.105 km²&lt;/td&gt;
&lt;td&gt;Precise matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;~0.003 km²&lt;/td&gt;
&lt;td&gt;Street-level&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Uber uses Resolutions 7-9&lt;/strong&gt; for the vast majority of its operations: finding drivers, calculating surge pricing, and analyzing supply and demand.&lt;/p&gt;

&lt;h3&gt;
  
  
  K-Ring — Neighborhood Search
&lt;/h3&gt;

&lt;p&gt;A K-Ring is a collection of all cells within K steps from a center cell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;K=0 (center cell only):     K=1 (center cell + 6 neighbors):

       ╱╲                           ╱╲    ╱╲
      ╱  ╲                        ╱    ╲╱    ╲
     │ ●  │                       │    ││    │
      ╲  ╱                        ╲    ╱╲    ╱
       ╲╱                          ╲╱    ╲╱
                                    ╱╲  ● ╱╲
     1 cell                       │    ││    │
                                   ╲    ╱╲    ╱
                                    ╲╱    ╲╱
                                    7 cells

K=2:  19 cells     K=3: 37 cells
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Algorithm to find drivers:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Convert the rider's coordinates into an H3 index (resolution 8).&lt;/li&gt;
&lt;li&gt;Calculate K-Ring(1) → 7 hexagonal cells.&lt;/li&gt;
&lt;li&gt;Find all drivers currently in these 7 cells from Redis.&lt;/li&gt;
&lt;li&gt;If not enough drivers are found → expand to K-Ring(2) → 19 cells.&lt;/li&gt;
&lt;li&gt;Sort by actual ETA (not just straight-line distance).&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Code Example (Go)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/uber/h3-go/v4"&lt;/span&gt;

&lt;span class="c"&gt;// Convert coordinates to H3 index&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;findNearbyDrivers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;riderLat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;riderLng&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radius&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Resolution 8: each cell is ~0.74 km²&lt;/span&gt;
    &lt;span class="n"&gt;riderCell&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;h3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LatLngToCell&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LatLng&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Lat&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;riderLat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Lng&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;riderLng&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// K-Ring: get all cells within a radius of K steps&lt;/span&gt;
    &lt;span class="n"&gt;searchCells&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;h3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GridDisk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;riderCell&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;driverIDs&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cell&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;searchCells&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// Look up in Redis: key = H3 cell, value = list of driver IDs&lt;/span&gt;
        &lt;span class="n"&gt;cellKey&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"drivers:h3:%s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cell&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;ids&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;redisClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SMembers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cellKey&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Val&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;driverIDs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driverIDs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;driverIDs&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Method 3: Google S2 Geometry
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;S2&lt;/strong&gt; (developed by Google) is also utilized by Uber in their DISCO system (Matching Engine). S2 divides the Earth's surface into square cells based on a projection onto a cube (following a Hilbert Curve).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Characteristics:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each cell is represented by a &lt;strong&gt;64-bit integer&lt;/strong&gt; → much faster for comparing, hashing, and sharding.&lt;/li&gt;
&lt;li&gt;30 levels of resolution.&lt;/li&gt;
&lt;li&gt;Google Maps, Foursquare, and MongoDB Geospatial all use S2.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/golang/geo/s2"&lt;/span&gt;

&lt;span class="c"&gt;// S2: Find all cells covering a 2km radius from a point&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;getCoveringCells&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lng&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radiusM&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CellID&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;center&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PointFromLatLng&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LatLngFromDegrees&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lng&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nb"&gt;cap&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CapFromCenterAngle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;center&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Angle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;radiusM&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;6371000.0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;coverer&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RegionCoverer&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;MinLevel&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;MaxLevel&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;MaxCells&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;coverer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Covering&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Storing Locations: Redis GEO
&lt;/h2&gt;

&lt;p&gt;The locations of all online drivers are stored entirely in &lt;strong&gt;RAM&lt;/strong&gt; (Redis) because query speeds are &amp;lt; 1ms. Traditional databases are not used because they are far too slow for 1.25 million writes/second.&lt;/p&gt;

&lt;h3&gt;
  
  
  Approach 1: Redis GEO (Built-in)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Add/update a driver's location
GEOADD drivers:active 106.7009 10.7769 "driver:abc123"

-- Find drivers within a 2km radius
GEOSEARCH drivers:active FROMLONLAT 106.7009 10.7769 BYRADIUS 2 km ASC COUNT 20
-- Result: ["driver:abc123", "driver:def456", ...]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Approach 2: Redis SET + H3 Index (Uber's approach)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Each H3 cell is a Redis SET containing driver IDs
SADD drivers:h3:882a100d6dfffff "driver:abc123"
SADD drivers:h3:882a100d6dfffff "driver:def456"

-- When a driver moves to a new cell:
SREM drivers:h3:882a100d6dfffff "driver:abc123"  -- Remove from old cell
SADD drivers:h3:882a100d71fffff "driver:abc123"  -- Add to new cell

-- TTL to automatically clean up offline drivers:
-- Combine SET + EXPIRE or use a Sorted Set with timestamps
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Comparing the Two Approaches
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Redis GEO&lt;/th&gt;
&lt;th&gt;Redis SET + H3&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Radius Search&lt;/td&gt;
&lt;td&gt;✅ Built-in &lt;code&gt;GEOSEARCH&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;❌ Must calculate K-Rings manually&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sharding&lt;/td&gt;
&lt;td&gt;❌ Hard to shard (1 key holds everything)&lt;/td&gt;
&lt;td&gt;✅ Each H3 cell is a separate key → easy to shard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scale Limit&lt;/td&gt;
&lt;td&gt;~a few million&lt;/td&gt;
&lt;td&gt;~tens of millions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Surge/Analytics Integration&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ Count drivers/cell → calculate supply/demand instantly&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Uber uses the second approach (Redis + H3) because its sharding capabilities and integration with the Pricing Engine are far superior.&lt;/p&gt;




&lt;h2&gt;
  
  
  Consistent Hashing — Distributing GEO Data
&lt;/h2&gt;

&lt;p&gt;When a massive city like Ho Chi Minh City has 200,000 online drivers, a single Redis node might not have enough RAM. The solution is &lt;strong&gt;Consistent Hashing&lt;/strong&gt; to distribute the H3 cells across multiple Redis nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;H3 Cell "882a100d6dfffff"  →  hash() → Node A
H3 Cell "882a100d71fffff"  →  hash() → Node B
H3 Cell "882a100d73fffff"  →  hash() → Node A

Neighboring cells (a K-Ring) might reside on different nodes
→ A K-Ring query = a scatter-gather operation across multiple nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  References &amp;amp; Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://eng.uber.com/h3/" rel="noopener noreferrer"&gt;Uber Engineering: H3 Hexagonal Hierarchical Spatial Index&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://s2geometry.io/" rel="noopener noreferrer"&gt;Google S2 Geometry Library&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Next, we will delve into the backbone of the entire system — Apache Kafka — where every GPS event, ride request, and acceptance flows. Continue reading &lt;a href="https://dev.to/series/ride-hailing-realtime-architecture/part-3-event-streaming-kafka/"&gt;Part 3 — Event Streaming: The Apache Kafka &amp;amp; Flink Backbone&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/ride-hailing-realtime-architecture/part-2-geospatial-indexing/" rel="noopener noreferrer"&gt;H3 Geospatial Indexing: How Uber Finds Nearby Drivers with Hexagonal Spatial Index&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>database</category>
      <category>geospatial</category>
    </item>
    <item>
      <title>[System Design] GPS Location Ingestion at Scale: gRPC Streaming, MQTT &amp; Kalman Filter in Ride-Hailing</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Sat, 13 Jun 2026 10:38:16 +0000</pubDate>
      <link>https://dev.to/vesviet/system-design-gps-location-ingestion-at-scale-grpc-streaming-mqtt-kalman-filter-in-3pm</link>
      <guid>https://dev.to/vesviet/system-design-gps-location-ingestion-at-scale-grpc-streaming-mqtt-kalman-filter-in-3pm</guid>
      <description>&lt;h2&gt;
  
  
  The Challenge: Millions of Drivers, Every 4 Seconds
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Answer-first:&lt;/strong&gt; Uber and Grab handle 1.25 million GPS write operations per second from ~5 million active drivers. HTTP REST fails at this scale due to per-request TCP+TLS handshake overhead. The solution is persistent connections (gRPC streams or MQTT) with Protobuf serialization, Kalman Filter noise reduction, and batched coordinate uploads — cutting network calls by 67% while maintaining sub-200ms end-to-end latency.&lt;/p&gt;

&lt;p&gt;Grab has approximately &lt;strong&gt;5 million drivers&lt;/strong&gt; operating in Southeast Asia. Uber has over &lt;strong&gt;5 million drivers&lt;/strong&gt; globally. If every driver sends a GPS coordinate every 4 seconds, the system must receive:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;5,000,000 drivers ÷ 4 seconds = 1,250,000 GPS packets / second
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is &lt;strong&gt;1.25 million write operations per second&lt;/strong&gt; — just for location data alone, not counting other requests. Traditional HTTP REST cannot handle this load efficiently.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why is HTTP REST Unsuitable?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Connection Overhead
&lt;/h3&gt;

&lt;p&gt;Every HTTP request requires:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;TCP Handshake&lt;/strong&gt; (3-way handshake)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TLS Handshake&lt;/strong&gt; (an additional 1-2 round trips if using HTTPS)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP Headers&lt;/strong&gt; (~200-800 bytes of overhead per request)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Closing the connection&lt;/strong&gt; (or keep-alive timeouts)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With a GPS payload of only about &lt;strong&gt;50-100 bytes&lt;/strong&gt; (latitude, longitude, timestamp, speed, heading), the HTTP overhead is 5-10 times larger than the actual data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Battery Drain
&lt;/h3&gt;

&lt;p&gt;Every time a new HTTP connection is established, the phone's radio chip (4G/5G) must transition from an idle state to an active state, consuming significant energy. If a driver works 8-12 hours a day, the battery will drain rapidly.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: Ultra-lightweight Protocols
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. MQTT (Message Queuing Telemetry Transport)
&lt;/h3&gt;

&lt;p&gt;MQTT was originally designed for IoT devices with limited bandwidth and small batteries — perfect for continuous GPS tracking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why MQTT is efficient:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Persistent Connection:&lt;/strong&gt; Handshakes only once, then continuously sends coordinates without re-establishing the connection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tiny Headers:&lt;/strong&gt; Only 2 bytes of overhead compared to hundreds of bytes for HTTP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexible Quality of Service (QoS):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;QoS 0: Fire-and-forget — suitable for GPS because a new coordinate will arrive a few seconds later anyway.&lt;/li&gt;
&lt;li&gt;QoS 1: Guarantees delivery at least once.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Last Will and Testament (LWT):&lt;/strong&gt; If a driver loses connection abruptly, the MQTT broker automatically notifies the server that the driver has gone offline.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;MQTT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Data&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Flow:&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;Driver&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;App&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;MQTT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Publish&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;MQTT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Broker&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Cluster&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Kafka&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;Topic:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"driver/{driver_id}/location"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;Payload:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"lat"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;10.7769&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"lng"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;106.7009&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1717689600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"speed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;35.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"heading"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;127&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"accuracy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. gRPC Bidirectional Streaming
&lt;/h3&gt;

&lt;p&gt;Uber later transitioned to using &lt;strong&gt;gRPC Streams&lt;/strong&gt; (and QUIC/HTTP3) instead of MQTT for many real-time data streams.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advantages over MQTT:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Protobuf serialization:&lt;/strong&gt; Binary payloads are 3-10 times smaller than JSON.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bidirectional:&lt;/strong&gt; The server can push data back (ride offers, navigation) over the same connection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flow control:&lt;/strong&gt; gRPC has built-in backpressure, crucial when servers are overloaded.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiplexing:&lt;/strong&gt; HTTP/2 allows multiple streams over a single TCP connection, avoiding head-of-line blocking.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// gRPC service definition for Location Streaming&lt;/span&gt;
&lt;span class="kd"&gt;service&lt;/span&gt; &lt;span class="n"&gt;LocationService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Driver sends a continuous stream of coordinates&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;StreamLocation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;LocationUpdate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;ServerCommand&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;LocationUpdate&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="na"&gt;latitude&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="na"&gt;longitude&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int64&lt;/span&gt; &lt;span class="na"&gt;timestamp_ms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="na"&gt;speed_mps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// meters/second&lt;/span&gt;
  &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="na"&gt;heading_degrees&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// direction (0-360)&lt;/span&gt;
  &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="na"&gt;accuracy_meters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// GPS accuracy&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;driver_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;ServerCommand&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;oneof&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;NewRideOffer&lt;/span&gt; &lt;span class="na"&gt;ride_offer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;NavigationUpdate&lt;/span&gt; &lt;span class="na"&gt;nav&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;PingRequest&lt;/span&gt; &lt;span class="na"&gt;ping&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Optimization Techniques on the Mobile App
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Batching — Grouping Coordinates
&lt;/h3&gt;

&lt;p&gt;Instead of sending each coordinate individually, the app batches 3-5 GPS points and sends them together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Without batching:      With batching:
GPS 1 → Send          GPS 1 ─┐
GPS 2 → Send          GPS 2 ─┤ → Send 1 batch
GPS 3 → Send          GPS 3 ─┘
GPS 4 → Send          GPS 4 ─┐
GPS 5 → Send          GPS 5 ─┤ → Send 1 batch
GPS 6 → Send          GPS 6 ─┘

6 network calls        2 network calls (67% savings)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Adaptive Frequency
&lt;/h3&gt;

&lt;p&gt;The app doesn't need to send coordinates at a strict 4-second interval. It adjusts the frequency based on context:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Driver State&lt;/th&gt;
&lt;th&gt;GPS Frequency&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Waiting for a ride (idle)&lt;/td&gt;
&lt;td&gt;Every 15-30 seconds&lt;/td&gt;
&lt;td&gt;Saves battery, driver isn't moving much&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Moving to find passengers&lt;/td&gt;
&lt;td&gt;Every 4-5 seconds&lt;/td&gt;
&lt;td&gt;Needs updated location for matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-trip (carrying a passenger)&lt;/td&gt;
&lt;td&gt;Every 2-3 seconds&lt;/td&gt;
&lt;td&gt;Needs high accuracy for ETA and mapping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High speed (&amp;gt;60 km/h)&lt;/td&gt;
&lt;td&gt;Every 1-2 seconds&lt;/td&gt;
&lt;td&gt;Moving fast, location changes rapidly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stationary (at a red light)&lt;/td&gt;
&lt;td&gt;Paused&lt;/td&gt;
&lt;td&gt;GPS hasn't changed, 100% savings&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Dead Reckoning — Position Interpolation
&lt;/h3&gt;

&lt;p&gt;Between receiving GPS points, the rider app uses &lt;strong&gt;Dead Reckoning&lt;/strong&gt; (trajectory prediction) to make the car appear to move smoothly on the map:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Actual received GPS positions: ● (every 4 seconds)
Interpolated positions: ○ (every 100ms)

●───○───○───○───○───○───○───●───○───○───○───○───●
t=0                         t=4s                t=8s

Simple interpolation formula:
  predicted_lat = last_lat + (speed × cos(heading) × Δt)
  predicted_lng = last_lng + (speed × sin(heading) × Δt)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Filtering GPS Noise: The Kalman Filter
&lt;/h2&gt;

&lt;p&gt;GPS signals from phones are often &lt;strong&gt;noisy&lt;/strong&gt; — especially in cities with tall buildings where signals reflect (the urban canyon effect). The result: cars on the map jump erratically or appear to drive through buildings.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the Kalman Filter Works
&lt;/h3&gt;

&lt;p&gt;The Kalman Filter is a continuous &lt;strong&gt;predict → update&lt;/strong&gt; algorithm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Loop for every new GPS point:

1. PREDICT:
   Based on the previous position, speed, and heading,
   predict the current position using a physics model.

   predicted_position = previous_position + velocity × Δt

2. UPDATE (Correct):
   Compare the prediction with the actual received GPS point.
   Assign weights: trust the prediction more, or the GPS more?

   If GPS accuracy = 5m (accurate): trust the GPS more
   If GPS accuracy = 50m (noisy):   trust the prediction more

   final_position = weighted_average(predicted, gps_measured)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lyft Engineering described in their technical blog that they use a &lt;strong&gt;Marginalized Particle Filter&lt;/strong&gt; — a more advanced version of the Kalman Filter — which can track multiple possible trajectories simultaneously when it's uncertain which road the car is on (e.g., a main road vs. a parallel frontage road).&lt;/p&gt;

&lt;h3&gt;
  
  
  Map Matching — Snapping to the Road
&lt;/h3&gt;

&lt;p&gt;After filtering the noise, the system performs &lt;strong&gt;Map Matching&lt;/strong&gt; — snapping the filtered coordinates onto a digitized road network:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw GPS (after Kalman):   Map Matched:

    ○  ○                     ●──●
   ○    ○                   ●    ●
  ○      ○   ──────►       ●      ●
   ○    ○                   ●    ●
    ○  ○                     ●──●

(Scattered coordinates)   (Snapped closely to the road)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Complete Location Pipeline Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Driver Phone GPS Sensor
        │
        ▼
  ┌───────────────┐
  │  Kalman Filter │ (On the phone)
  │  + Batching    │
  └───────┬───────┘
          │ gRPC Stream / MQTT
          ▼
  ┌───────────────┐
  │  Load Balancer │ (Nginx / Envoy)
  └───────┬───────┘
          │
          ▼
  ┌───────────────┐
  │  Location     │ → Validate, enrich, convert to H3 index
  │  Service      │
  └───────┬───────┘
          │ Produce to Kafka
          ▼
  ┌───────────────┐
  │  Apache Kafka  │  Topic: "driver.location.updates"
  └───┬───┬───┬───┘
      │   │   │
      ▼   ▼   ▼
   Redis  Flink  Data Lake
   GEO    (Stream  (S3/HDFS
   Cache  Processing) Analytics)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Real-world Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Estimated Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Concurrent online drivers (Grab SEA)&lt;/td&gt;
&lt;td&gt;~1.5 million&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Average GPS frequency&lt;/td&gt;
&lt;td&gt;Every 4 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Location Service Throughput&lt;/td&gt;
&lt;td&gt;~375,000 msg/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPS payload size (Protobuf)&lt;/td&gt;
&lt;td&gt;~40-60 bytes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raw GPS bandwidth&lt;/td&gt;
&lt;td&gt;~15-22 MB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;End-to-end latency (phone → Redis)&lt;/td&gt;
&lt;td&gt;&amp;lt; 200ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Next, we will explore how Uber uses the H3 algorithm to divide the map into millions of hexagons and find the closest driver in the blink of an eye. Continue reading &lt;a href="https://dev.to/series/ride-hailing-realtime-architecture/part-2-geospatial-indexing/"&gt;Part 2 — Geospatial Indexing: H3, S2 Geometry &amp;amp; Redis GEO&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;em&gt;This post was originally published on my blog at &lt;a href="https://tanhdev.com/series/ride-hailing-realtime-architecture/part-1-location-ingestion/" rel="noopener noreferrer"&gt;GPS Location Ingestion at Scale: gRPC Streaming, MQTT &amp;amp; Kalman Filter in Ride-Hailing&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hi, I'm Lê Tuấn Anh (vesviet) 👋&lt;/strong&gt; &lt;br&gt;
&lt;em&gt;I am a Senior Go Backend Architect &amp;amp; Distributed Systems Engineer with 17+ years of experience building high-traffic platforms (25M+ requests/month).&lt;/em&gt; &lt;br&gt;
&lt;em&gt;If you enjoyed this deep-dive, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or explore my consulting services at &lt;a href="https://tanhdev.com/hire" rel="noopener noreferrer"&gt;tanhdev.com/hire&lt;/a&gt;.&lt;/em&gt; &lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>go</category>
      <category>ridehailing</category>
    </item>
    <item>
      <title>The Landscape of Core Banking Developers: Building Financial Systems from Scratch</title>
      <dc:creator>Tuấn Anh</dc:creator>
      <pubDate>Sat, 06 Jun 2026 15:17:38 +0000</pubDate>
      <link>https://dev.to/vesviet/the-landscape-of-core-banking-developers-building-financial-systems-from-scratch-5118</link>
      <guid>https://dev.to/vesviet/the-landscape-of-core-banking-developers-building-financial-systems-from-scratch-5118</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This post is the Executive Summary of my 8-part long-form series: *&lt;/em&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/" rel="noopener noreferrer"&gt;Learning Path to Become a Core Banking Developer&lt;/a&gt;** originally published on my blog.*&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Who is a Core Banking Developer?
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;Core Banking Developer&lt;/strong&gt; is a software engineer responsible for building, operating, and extending the system that processes all of a bank's core financial operations — from managing accounts, processing money transfers, and calculating interest rates, to ensuring every single penny is recorded with absolute accuracy.&lt;/p&gt;

&lt;p&gt;Unlike a typical developer, a mistake for a Core Banking Developer doesn't just mean a 404 error page — it means &lt;strong&gt;customers' money being lost, duplicated, or the general ledger becoming unbalanced&lt;/strong&gt;. This intense pressure defines their entire approach to writing code and designing systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why is this field special?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Absolute Accuracy
&lt;/h3&gt;

&lt;p&gt;In regular software development, "eventual consistency" is often acceptable. In Core Banking, &lt;strong&gt;a transaction either completely succeeds or does not happen at all&lt;/strong&gt;. There is no in-between state. This is why ACID database transactions are an indispensable foundation.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Extremely High Concurrency
&lt;/h3&gt;

&lt;p&gt;Millions of users can perform transactions simultaneously within the same second. The system must handle concurrency without allowing race conditions that could lead to incorrect deductions or double credits.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Compliance and Legal Requirements
&lt;/h3&gt;

&lt;p&gt;Every action in a Core Banking system must have an audit trail. The State Bank, tax authorities, and international organizations reserve the right to review the entire transaction history at any given time.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Knowledge Map of a Core Banking Developer
&lt;/h2&gt;

&lt;p&gt;If you want to transition from a regular backend engineer to a financial systems architect, you need to bridge both domain knowledge and heavy technical architecture.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────────┐
│                   CORE BANKING DEVELOPER                        │
│                                                                 │
│  DOMAIN KNOWLEDGE          TECHNICAL SKILLS                     │
│  ─────────────────          ────────────────                    │
│  • Double-Entry (GL)       • Database (ACID, Locking)           │
│  • CASA (Deposits)         • Distributed Transactions           │
│  • Lending (Credit)        • Event-Driven Architecture          │
│  • Payments &amp;amp; Clearing     • API Design (REST/gRPC)             │
│  • Trade Finance           • Security &amp;amp; Encryption              │
│                                                                 │
│  STANDARDS &amp;amp; PROTOCOLS     ARCHITECTURE PATTERNS                │
│  ─────────────────────     ─────────────────────                │
│  • ISO 8583 (Card/ATM)     • Saga Pattern                       │
│  • ISO 20022 (SWIFT)       • Outbox Pattern                     │
│  • BIAN Framework          • CQRS &amp;amp; Event Sourcing              │
│  • PCI-DSS                 • Idempotency Keys                   │
└─────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Market Landscape
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Popular Core Banking Systems
&lt;/h3&gt;

&lt;p&gt;To understand the ecosystem, here are the dominant legacy and modern systems you will encounter in the banking industry (with examples of adoptions in regional markets like Vietnam):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;Core Technology&lt;/th&gt;
&lt;th&gt;Banks Using It&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Temenos T24&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Java, jBASE/BASIC&lt;/td&gt;
&lt;td&gt;Techcombank, VPBank, MB Bank, Sacombank&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Oracle Flexcube&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Java EE, Oracle DB&lt;/td&gt;
&lt;td&gt;VietinBank, BIDV&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Infosys Finacle&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Java&lt;/td&gt;
&lt;td&gt;Agribank (Under deployment)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;In-house (Custom)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Go, Java, Kotlin&lt;/td&gt;
&lt;td&gt;MoMo, ZaloPay, VCB Digibank&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Next-Generation Trends
&lt;/h3&gt;

&lt;p&gt;Digital banks and fintechs are no longer purchasing off-the-shelf monolithic Core Banking systems — they are &lt;strong&gt;building their own Core Banking systems using Microservices&lt;/strong&gt;. This represents a massive opportunity for full-stack developers with a mindset for distributed systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  Ready to dive deeper?
&lt;/h2&gt;

&lt;p&gt;This was just the executive summary. If you want to actually build these systems, I have written a complete step-by-step roadmap for developers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/part-1-double-entry-ledger/" rel="noopener noreferrer"&gt;Part 1 — The Double-Entry Ledger Foundation&lt;/a&gt;&lt;/strong&gt; &lt;em&gt;(The mental foundation you cannot skip)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/part-2-banking-domain-casa-lending/" rel="noopener noreferrer"&gt;Part 2 — Core Banking Domain: CASA &amp;amp; Lending&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/part-3-database-transactions-acid/" rel="noopener noreferrer"&gt;Part 3 — Database Design for Financial Transactions&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/part-4-modern-core-banking-architecture/" rel="noopener noreferrer"&gt;Part 4 — Modern Core Banking Architecture&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/part-5-iso-standards-integration/" rel="noopener noreferrer"&gt;Part 5 — International Integration Standards (ISO 8583)&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/part-6-security-compliance-audit/" rel="noopener noreferrer"&gt;Part 6 — Security, Compliance &amp;amp; Audit&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/part-7-build-mini-core-banking/" rel="noopener noreferrer"&gt;Part 7 — Build a Mini Core Banking System from Scratch&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://tanhdev.com/series/core-banking-developer/" rel="noopener noreferrer"&gt;Read the full 8-part series on my blog&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Lê Tuấn Anh, a Go Backend Architect specializing in microservices migrations and financial platforms. If you found this useful, let's connect on &lt;a href="https://www.linkedin.com/in/vesviet" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>architecture</category>
      <category>microservices</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
