<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Bikash Kh</title>
    <description>The latest articles on DEV Community by Bikash Kh (@bikash_kh_).</description>
    <link>https://dev.to/bikash_kh_</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3993343%2F39785690-a474-404e-bdda-85d405ca4833.jpg</url>
      <title>DEV Community: Bikash Kh</title>
      <link>https://dev.to/bikash_kh_</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bikash_kh_"/>
    <language>en</language>
    <item>
      <title>How I Replicated Uber's Core Marketplace in Python: A Technical Deep Dive</title>
      <dc:creator>Bikash Kh</dc:creator>
      <pubDate>Fri, 19 Jun 2026 23:38:35 +0000</pubDate>
      <link>https://dev.to/bikash_kh_/how-i-replicated-ubers-core-marketplace-in-python-a-technical-deep-dive-52mc</link>
      <guid>https://dev.to/bikash_kh_/how-i-replicated-ubers-core-marketplace-in-python-a-technical-deep-dive-52mc</guid>
      <description>&lt;h1&gt;
  
  
  I Built UberSim v2.0: A Production-Grade Urban Mobility Intelligence Platform 🚗🧠
&lt;/h1&gt;

&lt;p&gt;Every time you open Uber and see a &lt;strong&gt;2.1× surge multiplier&lt;/strong&gt;, a complex system has already predicted demand, optimized prices, matched drivers, and logged events for future learning — all within milliseconds.&lt;/p&gt;

&lt;p&gt;I wanted to understand how those systems work.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;UberSim v2.0&lt;/strong&gt;, a Python-based urban mobility intelligence platform that simulates the core engineering challenges behind modern ride-sharing marketplaces.&lt;/p&gt;

&lt;p&gt;Instead of building another dashboard project, I wanted to recreate the intelligence layer behind a ride-sharing platform from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  🚀 What's Inside?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🧠 Demand Forecasting
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Spatio-temporal demand prediction (&lt;strong&gt;R² = 0.89&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Weather effects, seasonality, lag features, and neighboring zone influence&lt;/li&gt;
&lt;li&gt;Predicts ride demand across multiple city zones&lt;/li&gt;
&lt;/ul&gt;
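
&lt;p&gt;To make the feature list above more concrete, here is a minimal NumPy sketch of lag features and neighboring-zone influence. The function name &lt;code&gt;build_features&lt;/code&gt;, the lag choices, and the adjacency matrix are illustrative assumptions, not the code from the UberSim repository.&lt;/p&gt;

```python
import numpy as np

def build_features(demand, adjacency, lags=(1, 2, 24)):
    """Build lag and neighbor features for every (time, zone) pair.

    demand:    array of shape (T, Z), ride counts per time step and zone
    adjacency: array of shape (Z, Z), 1.0 where two zones are neighbors
    Returns X of shape ((T - max_lag) * Z, len(lags) + 1) and target y.
    """
    demand = np.asarray(demand, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    max_lag = max(lags)

    # Row-normalize so each zone sees the mean demand of its neighbors.
    degree = np.maximum(adjacency.sum(axis=1, keepdims=True), 1.0)
    neighbor_mean = demand @ (adjacency / degree).T   # shape (T, Z)

    # One column per lag: the zone's own demand `lag` steps earlier.
    columns = [demand[max_lag - lag : len(demand) - lag] for lag in lags]
    # Neighbor demand one step back, so no future information leaks in.
    columns.append(neighbor_mean[max_lag - 1 : len(demand) - 1])

    X = np.stack([c.ravel() for c in columns], axis=1)
    y = demand[max_lag:].ravel()
    return X, y
```

&lt;p&gt;The resulting &lt;code&gt;X&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; can be passed to any regressor. Weather and seasonality columns would be appended in the same way.&lt;/p&gt;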

&lt;h3&gt;
  
  
  🕸️ Graph Neural Networks
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Models the city as a graph&lt;/li&gt;
&lt;li&gt;Nodes = city zones&lt;/li&gt;
&lt;li&gt;Edges = historical trip flows&lt;/li&gt;
&lt;li&gt;Captures spatial mobility patterns that traditional models miss&lt;/li&gt;
&lt;/ul&gt;
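
&lt;p&gt;As a rough illustration of the graph view above, the sketch below builds a zone-to-zone flow matrix from trips and applies one graph-convolution step in NumPy. The helper names and the choice of a symmetric, GCN-style normalization are my assumptions for this example. The repository may structure its model differently.&lt;/p&gt;

```python
import numpy as np

def flow_adjacency(trips, num_zones):
    """Count trips between zones; trips is a list of (origin, destination)."""
    A = np.zeros((num_zones, num_zones))
    for origin, dest in trips:
        A[origin, dest] += 1.0
    return A

def gcn_layer(A, H, W):
    """One graph-convolution step: ReLU(D^-1/2 A_hat D^-1/2 H W).

    A_hat is the flow matrix made symmetric, with self-loops added.
    H holds one feature row per zone, W is the layer's weight matrix.
    """
    A_hat = A + A.T + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)
```

&lt;p&gt;Each output row mixes a zone's own features with those of the zones it exchanges trips with, weighted by trip volume.&lt;/p&gt;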

&lt;h3&gt;
  
  
  🤖 Reinforcement Learning Pricing
&lt;/h3&gt;

&lt;p&gt;I built a PPO-based surge pricing engine that learns pricing policies instead of relying on hand-crafted rules.&lt;/p&gt;

&lt;p&gt;It optimizes several objectives simultaneously:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📈 Platform revenue&lt;/li&gt;
&lt;li&gt;🚕 Driver earnings&lt;/li&gt;
&lt;li&gt;😊 Rider welfare&lt;/li&gt;
&lt;li&gt;⏱️ Wait times&lt;/li&gt;
&lt;li&gt;⚖️ Fairness constraints&lt;/li&gt;
&lt;/ul&gt;
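
&lt;p&gt;One common way to hand several objectives to a single-reward algorithm such as PPO is to fold them into a weighted sum. The sketch below shows that idea. The field names, the weights, and the surge-gap fairness penalty are hypothetical choices for illustration, not the reward used in UberSim.&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class StepOutcome:
    platform_revenue: float   # platform's cut of fares in this step
    driver_earnings: float    # total paid out to drivers
    rider_surplus: float      # willingness to pay minus price paid
    mean_wait_min: float      # average rider wait, in minutes
    zone_surges: list         # surge multiplier applied in each zone

def reward(o, w_rev=1.0, w_drv=0.5, w_rider=0.5, w_wait=0.2, w_fair=1.0):
    """Weighted sum of objectives minus wait-time and fairness penalties."""
    # Fairness penalty: spread between the most and least surged zone.
    surge_gap = max(o.zone_surges) - min(o.zone_surges)
    return (w_rev * o.platform_revenue
            + w_drv * o.driver_earnings
            + w_rider * o.rider_surplus
            - w_wait * o.mean_wait_min
            - w_fair * surge_gap)
```

&lt;p&gt;Raising &lt;code&gt;w_fair&lt;/code&gt; trades revenue for more uniform pricing across zones, which is the tension discussed later in this post.&lt;/p&gt;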

&lt;p&gt;One interesting finding:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The RL agent learned to gradually increase surge prices instead of aggressively reacting to demand spikes. This behavior wasn't explicitly programmed.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  ⚡ Kafka-Style Real-Time Streaming
&lt;/h3&gt;

&lt;p&gt;Implemented an event-driven architecture with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ride request streams&lt;/li&gt;
&lt;li&gt;Driver status updates&lt;/li&gt;
&lt;li&gt;Pricing events&lt;/li&gt;
&lt;li&gt;Match results&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The stream layer supports historical replay and live marketplace metrics.&lt;/p&gt;
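
&lt;p&gt;The core idea behind a Kafka-style stream is an append-only log per topic, read by offset. Here is a minimal in-memory sketch of that idea. The class name &lt;code&gt;EventLog&lt;/code&gt;, its methods, and the topic name are assumptions for this example and do not mirror the Kafka client API or the UberSim code.&lt;/p&gt;

```python
from collections import defaultdict

class EventLog:
    """In-memory, append-only log per topic, loosely modeled on Kafka topics."""

    def __init__(self):
        self._topics = defaultdict(list)

    def publish(self, topic, event):
        """Append an event and return its offset within the topic."""
        log = self._topics[topic]
        log.append(event)
        return len(log) - 1

    def replay(self, topic, from_offset=0):
        """Yield (offset, event) pairs starting at from_offset."""
        log = self._topics[topic]
        for offset in range(from_offset, len(log)):
            yield offset, log[offset]

log = EventLog()
log.publish("ride_requests", {"rider": "r1", "zone": 3})
log.publish("ride_requests", {"rider": "r2", "zone": 5})

# A consumer that joins late can replay from any stored offset.
late_consumer = list(log.replay("ride_requests", from_offset=1))
```

&lt;p&gt;Because events are never mutated, replaying from offset 0 reproduces the full history, which is what makes historical replay possible.&lt;/p&gt;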




&lt;h3&gt;
  
  
  🧠 Driver State LSTM
&lt;/h3&gt;

&lt;p&gt;Predicts four operational driver states:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;online_idle&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;online_busy&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;relocating&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;offline&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model is built entirely in NumPy, with hand-written backpropagation through time (BPTT) and an Adam optimizer.&lt;/p&gt;
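
&lt;p&gt;For readers who have not written an LSTM by hand, here is a sketch of the forward pass only: one LSTM step plus a softmax over the four driver states. The weight layout and function names are assumptions for this example. Training (BPTT and Adam) is left out to keep it short.&lt;/p&gt;

```python
import numpy as np

STATES = ["online_idle", "online_busy", "relocating", "offline"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM step. W has shape (4 * hidden, input + hidden)."""
    hidden = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i = sigmoid(z[:hidden])                  # input gate
    f = sigmoid(z[hidden:2 * hidden])        # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])    # output gate
    g = np.tanh(z[3 * hidden:])              # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def predict_state(sequence, W, b, W_out, b_out):
    """Run the LSTM over a feature sequence and return state probabilities."""
    hidden = W.shape[0] // 4
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x in sequence:
        h, c = lstm_step(x, h, c, W, b)
    logits = W_out @ h + b_out
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return exp / exp.sum()
```

&lt;p&gt;The predicted state is &lt;code&gt;STATES[int(np.argmax(probs))]&lt;/code&gt;, where &lt;code&gt;probs&lt;/code&gt; is the array returned by &lt;code&gt;predict_state&lt;/code&gt;.&lt;/p&gt;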




&lt;h3&gt;
  
  
  🧪 Counterfactual A/B Testing
&lt;/h3&gt;

&lt;p&gt;Implemented production-style experimentation techniques:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IPS (Inverse Propensity Scoring)&lt;/li&gt;
&lt;li&gt;Doubly Robust Estimation&lt;/li&gt;
&lt;li&gt;CUPED variance reduction&lt;/li&gt;
&lt;li&gt;Bootstrap confidence intervals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These estimators make it possible to evaluate a new policy offline from logged data, without deploying every candidate to production.&lt;/p&gt;
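
&lt;p&gt;Two of the techniques above fit in a few lines of NumPy. The sketch below shows a basic IPS estimator and a CUPED adjustment. The function names and argument layout are assumptions for this example, and the Doubly Robust estimator and bootstrap intervals are not shown.&lt;/p&gt;

```python
import numpy as np

def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse propensity scoring estimate of the target policy's mean reward.

    Each logged reward is reweighted by how much more (or less) likely the
    target policy was to take the logged action than the logging policy.
    """
    weights = np.asarray(target_probs) / np.asarray(logging_probs)
    return float(np.mean(weights * np.asarray(rewards)))

def cuped_adjust(metric, covariate):
    """Remove the part of the metric explained by a pre-experiment covariate."""
    metric = np.asarray(metric, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    # theta is the regression slope of the metric on the covariate.
    theta = np.cov(metric, covariate)[0, 1] / np.var(covariate, ddof=1)
    return metric - theta * (covariate - covariate.mean())
```

&lt;p&gt;The CUPED-adjusted metric keeps the same mean as the original but has lower variance whenever the covariate is correlated with it, so the same experiment reaches significance with fewer samples.&lt;/p&gt;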




&lt;h3&gt;
  
  
  🗺️ Multi-Modal Transit Planning
&lt;/h3&gt;

&lt;p&gt;Journey planning across six transportation modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🚗 Rideshare&lt;/li&gt;
&lt;li&gt;🚌 Bus&lt;/li&gt;
&lt;li&gt;🚇 Subway&lt;/li&gt;
&lt;li&gt;🚲 Bike&lt;/li&gt;
&lt;li&gt;🛴 Scooter&lt;/li&gt;
&lt;li&gt;🚶 Walking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Routes are found with A* and Dijkstra shortest-path search over a cost that balances:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Travel time&lt;/li&gt;
&lt;li&gt;Cost&lt;/li&gt;
&lt;li&gt;CO₂ emissions&lt;/li&gt;
&lt;li&gt;Number of transfers&lt;/li&gt;
&lt;/ul&gt;
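
&lt;p&gt;One simple way to balance the criteria above is to collapse them into a single weighted score and run Dijkstra over it. The sketch below does that, using only the standard library. The graph format, the weight keys, and the function name &lt;code&gt;cheapest_route&lt;/code&gt; are assumptions for this example, and the A* variant is not shown.&lt;/p&gt;

```python
import heapq
from itertools import count

def cheapest_route(graph, start, goal, weights):
    """Dijkstra over a scalarized cost.

    graph:   {node: [(neighbor, mode, {"time": .., "cost": .., "co2": ..}), ...]}
    weights: {"time": .., "cost": .., "co2": .., "transfer": ..}
    Returns (total_score, [(node, mode), ...]) or None if goal is unreachable.
    """
    tie = count()  # tiebreaker so the heap never compares modes or paths
    # The search state is (node, arrival mode), so the transfer penalty
    # depends only on the current state and Dijkstra stays correct.
    frontier = [(0.0, next(tie), start, None, [])]
    settled = set()
    while frontier:
        score, _, node, mode, path = heapq.heappop(frontier)
        if node == goal:
            return score, path
        if (node, mode) in settled:
            continue
        settled.add((node, mode))
        for nxt, nxt_mode, attrs in graph.get(node, []):
            step = sum(weights[k] * attrs[k] for k in ("time", "cost", "co2"))
            if mode is not None and nxt_mode != mode:
                step += weights["transfer"]   # penalize switching modes
            heapq.heappush(
                frontier,
                (score + step, next(tie), nxt, nxt_mode, path + [(nxt, nxt_mode)]),
            )
    return None
```

&lt;p&gt;Changing the weights changes the answer: a high &lt;code&gt;transfer&lt;/code&gt; weight favors single-mode trips, while a high &lt;code&gt;co2&lt;/code&gt; weight steers the search away from rideshare legs.&lt;/p&gt;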




&lt;h2&gt;
  
  
  💡 What I Learned
&lt;/h2&gt;

&lt;p&gt;The hardest problem isn't maximizing revenue.&lt;/p&gt;

&lt;p&gt;It's maximizing revenue &lt;strong&gt;while remaining fair&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Without constraints, the optimizer naturally prioritizes high-demand areas at the expense of low-supply neighborhoods.&lt;/p&gt;

&lt;p&gt;Adding fairness fundamentally changes the optimization landscape.&lt;/p&gt;

&lt;p&gt;Some other takeaways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;RL discovers strategies humans don't explicitly program.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GNNs capture spatial relationships that tabular models miss.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Causal inference is essential for policy evaluation.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pure NumPy is more powerful than people think.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🛠️ Tech Stack
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;Python · Streamlit · Plotly · Stable-Baselines3 · NetworkX · NumPy · scikit-learn · Gymnasium&lt;/code&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔮 What's Next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Graph Attention Networks (GAT)&lt;/li&gt;
&lt;li&gt;Multi-Agent Reinforcement Learning&lt;/li&gt;
&lt;li&gt;Real Kafka Broker Integration&lt;/li&gt;
&lt;li&gt;WebGL City Visualization&lt;/li&gt;
&lt;li&gt;Real-World Dataset Integration (NYC TLC, Chicago Divvy)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔗 GitHub
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/kh-bikash/ubersim" rel="noopener noreferrer"&gt;https://github.com/kh-bikash/ubersim&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feedback, ideas, and contributions are welcome 🚀&lt;/p&gt;

</description>
      <category>python</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
