I Built UberSim v2.0: A Production-Grade Urban Mobility Intelligence Platform
Every time you open Uber and see a 2.1× surge multiplier, a complex system has already predicted demand, optimized prices, matched drivers, and logged events for future learning, all within milliseconds.
I wanted to understand how those systems work.
So I built UberSim v2.0.
A Python-based urban mobility intelligence platform that simulates the core engineering challenges behind modern ride-sharing marketplaces.
Instead of building another dashboard project, I wanted to recreate the intelligence layer behind a ride-sharing platform from scratch.
What's Inside?
Demand Forecasting
- Spatio-temporal demand prediction (R² = 0.89)
- Weather effects, seasonality, lag features, and neighboring zone influence
- Predicts ride demand across multiple city zones
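To make the lag and neighboring-zone features concrete, here is a minimal NumPy sketch. The function name `build_features`, the hourly `(T, Z)` demand layout, and the 0/1 adjacency matrix are assumptions for illustration, not UberSim's actual code; weather and seasonality columns would be appended the same way.

```python
import numpy as np

def build_features(demand, adjacency, lags=(1, 2, 24)):
    """Hypothetical feature builder.

    demand:    (T, Z) array of rides per time step per zone.
    adjacency: (Z, Z) 0/1 matrix, adjacency[i, j] = 1 if zone j borders zone i.
    Returns X with one row per (time step, zone) and the matching targets y.
    """
    demand = np.asarray(demand, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    max_lag = max(lags)

    # Row-normalise so each zone averages over its neighbours.
    degree = adjacency.sum(axis=1, keepdims=True)
    weights = adjacency / np.maximum(degree, 1.0)

    rows, targets = [], []
    for t in range(max_lag, demand.shape[0]):
        lagged = [demand[t - k] for k in lags]        # own demand k steps ago
        neighbour_mean = weights @ demand[t - 1]      # neighbours' demand last step
        rows.append(np.column_stack(lagged + [neighbour_mean]))
        targets.append(demand[t])
    return np.vstack(rows), np.concatenate(targets)
```

The resulting `X` and `y` can be passed to any scikit-learn regressor; only past values enter each row, so the target never leaks into its own features.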
Graph Neural Networks
- Models the city as a graph
- Nodes = city zones
- Edges = historical trip flows
- Captures spatial mobility patterns that traditional models miss
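As a rough illustration of the zones-as-nodes, flows-as-edges idea, the sketch below builds a flow matrix from trip counts and runs one mean-aggregation step, the basic building block of a GNN layer. The trip tuples, the `propagate` helper, and the 50/50 mixing weight are assumptions, and a real layer would add learned weights and a nonlinearity.

```python
import numpy as np

def flow_matrix(trips, zones):
    """flows[i, j] = historical trips from zones[i] to zones[j]."""
    index = {z: k for k, z in enumerate(zones)}
    flows = np.zeros((len(zones), len(zones)))
    for origin, dest, count in trips:
        flows[index[origin], index[dest]] += count
    return flows

def propagate(flows, features, self_weight=0.5):
    """One message-passing step (hypothetical, no learned parameters).

    Each zone mixes its own features with the flow-weighted average
    of the zones that send trips into it.
    """
    inbound = flows.T                                   # row j: flows into zone j
    totals = inbound.sum(axis=1, keepdims=True)
    norm = inbound / np.maximum(totals, 1e-9)           # rows sum to 1 (or 0 if isolated)
    features = np.asarray(features, dtype=float)
    return self_weight * features + (1.0 - self_weight) * norm @ features
```

The same graph could be held in a NetworkX `DiGraph` with trip counts as edge weights; the dense matrix is used here only to keep the arithmetic visible.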
Reinforcement Learning Pricing
Built a PPO-based surge pricing engine that learns pricing policies instead of relying on hand-crafted rules.
Optimizes multiple objectives simultaneously:
- Platform revenue
- Driver earnings
- Rider welfare
- Wait times
- Fairness constraints
One interesting finding:
The RL agent learned to gradually increase surge prices instead of aggressively reacting to demand spikes. This behavior wasn't explicitly programmed.
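One common way to feed several objectives to a PPO agent is to collapse them into a single scalar reward with a penalty for violating the fairness constraint. The sketch below shows that pattern; the `StepOutcome` fields, the weights, and the service-rate gap used as a fairness measure are all assumptions for illustration, not the reward UberSim actually uses.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class StepOutcome:
    """Hypothetical per-step marketplace summary."""
    platform_revenue: float
    driver_earnings: float
    rider_surplus: float
    mean_wait_min: float
    zone_service_rates: List[float]  # fraction of requests served, per zone

def reward(o, w_rev=1.0, w_drv=0.5, w_rider=0.5, w_wait=0.2,
           w_fair=2.0, max_gap=0.2):
    """Weighted sum of objectives minus a soft fairness penalty.

    The penalty is zero while the best- and worst-served zones stay
    within max_gap of each other, and grows linearly beyond that.
    """
    gap = max(o.zone_service_rates) - min(o.zone_service_rates)
    fairness_penalty = max(0.0, gap - max_gap)
    return (w_rev * o.platform_revenue
            + w_drv * o.driver_earnings
            + w_rider * o.rider_surplus
            - w_wait * o.mean_wait_min
            - w_fair * fairness_penalty)
```

A function like this would be called inside the `step` method of a Gymnasium environment; changing the weights changes which trade-offs the learned policy prefers.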
Kafka-Style Real-Time Streaming
Implemented an event-driven architecture with:
- Ride request streams
- Driver status updates
- Pricing events
- Match results
Supports historical replay and live marketplace metrics.
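The core of a Kafka-style design is an append-only log per topic, where consumers read from an offset and can rewind to replay history. Here is a minimal in-memory sketch of that idea; the `EventLog` class and the topic names are hypothetical and stand in for whatever UberSim uses internally.

```python
from collections import defaultdict

class EventLog:
    """In-memory, append-only log keyed by topic (a sketch, not a broker)."""

    def __init__(self):
        self._topics = defaultdict(list)

    def publish(self, topic, event):
        """Append an event and return its offset within the topic."""
        log = self._topics[topic]
        log.append(event)
        return len(log) - 1

    def replay(self, topic, from_offset=0):
        """Yield events from from_offset onward, in publish order.

        A snapshot is taken first, so events published while a consumer
        is iterating do not appear in that replay.
        """
        yield from list(self._topics[topic][from_offset:])

    def size(self, topic):
        return len(self._topics[topic])


log = EventLog()
log.publish("ride_requests", {"rider": "r1", "zone": 3})
log.publish("ride_requests", {"rider": "r2", "zone": 5})
log.publish("pricing_events", {"zone": 3, "surge": 2.1})
recent = list(log.replay("ride_requests", from_offset=1))
```

Because events are never mutated or deleted, live metrics and historical replay can share the same read path: a live consumer simply keeps track of the last offset it has seen.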
Driver State LSTM
Predicts four operational driver states:
- online_idle
- online_busy
- relocating
- offline
Built entirely in NumPy with Backpropagation Through Time and Adam optimization.
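For readers who have not written an LSTM by hand, the forward pass is short. The sketch below shows one cell step and a softmax over the four driver states; the weight layout (gates stacked as input, forget, output, candidate) and the function names are assumptions, and the BPTT and Adam training code is omitted.

```python
import numpy as np

STATES = ["online_idle", "online_busy", "relocating", "offline"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM cell step.

    x: (D,) input, h and c: (H,) hidden and cell state,
    W: (4H, D + H) stacked gate weights, b: (4H,) stacked biases.
    """
    H = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i = sigmoid(z[:H])            # input gate
    f = sigmoid(z[H:2 * H])       # forget gate
    o = sigmoid(z[2 * H:3 * H])   # output gate
    g = np.tanh(z[3 * H:])        # candidate cell values
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def predict_state(sequence, W, b, W_out, b_out):
    """Run the cell over a sequence and classify the final hidden state."""
    H = b.shape[0] // 4
    h, c = np.zeros(H), np.zeros(H)
    for x in sequence:
        h, c = lstm_step(x, h, c, W, b)
    logits = W_out @ h + b_out
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return STATES[int(np.argmax(probs))], probs
```

Training would backpropagate the cross-entropy loss on `probs` through every `lstm_step` call in the sequence, which is what Backpropagation Through Time refers to.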
Counterfactual A/B Testing
Implemented production-style experimentation techniques:
- IPS (Inverse Propensity Scoring)
- Doubly Robust Estimation
- CUPED variance reduction
- Bootstrap confidence intervals
This makes it possible to evaluate policies offline from logged data, without deploying every experiment in production.
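The four techniques above are each only a few lines of NumPy in their textbook form. The sketch below uses standard definitions; the function names and argument layout are assumptions, and the reward-model predictions passed to `doubly_robust` are taken as given rather than fitted here.

```python
import numpy as np

def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse propensity scoring: reweight logged rewards by pi_target / pi_logging."""
    w = np.asarray(target_probs, float) / np.asarray(logging_probs, float)
    return float(np.mean(w * np.asarray(rewards, float)))

def doubly_robust(rewards, logging_probs, target_probs, q_logged, v_target):
    """Doubly robust estimate.

    q_logged: model-predicted reward for the action that was logged.
    v_target: model-predicted expected reward under the target policy.
    The IPS term corrects the model only where it was wrong.
    """
    w = np.asarray(target_probs, float) / np.asarray(logging_probs, float)
    residual = np.asarray(rewards, float) - np.asarray(q_logged, float)
    return float(np.mean(np.asarray(v_target, float) + w * residual))

def cuped_adjust(y, x_pre):
    """CUPED: remove the part of y explained by a pre-experiment covariate."""
    y, x_pre = np.asarray(y, float), np.asarray(x_pre, float)
    theta = np.cov(x_pre, y)[0, 1] / np.var(x_pre, ddof=1)
    return y - theta * (x_pre - x_pre.mean())

def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    values = np.asarray(values, float)
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    means = values[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```

CUPED leaves the mean of the metric unchanged and only shrinks its variance, so the adjusted values can be fed straight into `bootstrap_ci` to get a tighter interval.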
Multi-Modal Transit Planning
Journey planning across six transportation modes:
- Rideshare
- Bus
- Subway
- Bike
- Scooter
- Walking
Uses A*/Dijkstra optimization to balance:
- Travel time
- Cost
- CO₂ emissions
- Number of transfers
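One way to balance these four criteria with Dijkstra is to fold them into a single weighted edge cost and to track the arrival mode in the search state, so that a mode change can be charged as a transfer. The sketch below does exactly that; the edge tuple layout, the weights, and the `plan_journey` name are assumptions for illustration.

```python
import heapq
from itertools import count

def plan_journey(edges, start, goal, weights=(1.0, 1.0, 0.01, 5.0)):
    """Dijkstra over (stop, arrival mode) states.

    edges:   iterable of (from, to, mode, minutes, cost_usd, co2_grams).
    weights: (per minute, per dollar, per gram CO2, per transfer).
    Returns (total score, list of (stop, mode) legs).
    """
    w_time, w_cost, w_co2, w_transfer = weights
    graph = {}
    for u, v, mode, minutes, cost, co2 in edges:
        leg = w_time * minutes + w_cost * cost + w_co2 * co2
        graph.setdefault(u, []).append((v, mode, leg))

    tie = count()                      # tie-breaker so the heap never compares paths
    heap = [(0.0, next(tie), start, None, [])]
    settled = {}
    while heap:
        score, _, node, mode, path = heapq.heappop(heap)
        if node == goal:
            return score, path
        if settled.get((node, mode), float("inf")) <= score:
            continue                   # already reached this state more cheaply
        settled[(node, mode)] = score
        for nxt, nxt_mode, leg in graph.get(node, []):
            # Charge a transfer only when switching between two modes.
            transfer = w_transfer if mode not in (None, nxt_mode) else 0.0
            heapq.heappush(heap, (score + leg + transfer, next(tie),
                                  nxt, nxt_mode, path + [(nxt, nxt_mode)]))
    return float("inf"), []


edges = [
    ("home", "hub", "walk", 10, 0.0, 0),
    ("hub", "office", "subway", 15, 2.5, 300),
    ("home", "office", "rideshare", 12, 14.0, 2400),
]
balanced = plan_journey(edges, "home", "office")
fastest = plan_journey(edges, "home", "office", weights=(1.0, 0.0, 0.0, 0.0))
```

With the default weights the walk-plus-subway route wins despite the transfer; weighting time alone flips the answer to rideshare. A* would add an admissible heuristic (for example, straight-line travel time) to the heap priority without changing the rest.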
What I Learned
The hardest problem isn't maximizing revenue.
It's maximizing revenue while remaining fair.
Without constraints, optimization naturally prioritizes high-demand areas and disadvantages low-supply neighborhoods.
Adding fairness fundamentally changes the optimization landscape.
Some other takeaways:
- RL discovers strategies humans don't explicitly program.
- GNNs capture spatial relationships that tabular models miss.
- Causal inference is essential for policy evaluation.
- Pure NumPy is more powerful than people think.
Tech Stack
Python · Streamlit · Plotly · Stable-Baselines3 · NetworkX · NumPy · Scikit-Learn · Gymnasium
What's Next?
- [ ] Graph Attention Networks (GAT)
- [ ] Multi-Agent Reinforcement Learning
- [ ] Real Kafka Broker Integration
- [ ] WebGL City Visualization
- [ ] Real-World Dataset Integration (NYC TLC, Chicago Divvy)
GitHub
https://github.com/kh-bikash/ubersim
Feedback, ideas, and contributions are welcome.