Uriel Bitton

Posted on Jan 31

Designing The Data Model For A Ride Sharing Platform With DynamoDB

#dynamodb #aws #database #datamodel

How do Uber and Lyft design their databases for maximum efficiency?

Would you like to know?

Me too, but here is my take on an efficient design using DynamoDB and the single table design.

The general aim of this ride-sharing platform challenge is the following:

We want to store data about riders, drivers, rides, payments, and user ratings, and design the data model so that we can efficiently query related data together, ideally in as few queries as possible.

Let’s take a look at the entity relationship diagram for our application, below.

Here’s the diagram above in summary:

A rider can request one ride from multiple available (nearby) drivers
A driver can provide a ride to one rider
After a ride is complete, the rider pays the driver
The rider can also add a rating to the driver

Here are a few important access patterns we will consider:

Get a driver/rider by userID
List active drivers in a given region
Get current ride for a driver or rider
Get ride history for a rider or driver
List active rides in a given region
Get payment details for a given ride
Get all ratings for a given driver

Now that we know our main access patterns, let’s try to come up with a table design that will satisfy these access patterns.

This type of use case requires a careful approach to organizing data items and leveraging composite primary keys to efficiently retrieve data.

We’ll proceed with the following plan:

The single table will serve as many primary access patterns as possible, efficiently. This will be done with primary key overloading.
The primary value of this (and any) ride-sharing app is allowing users (riders) to find drivers nearby and give them details about their ride.

With these points in mind, we can focus on what is most important for our base table and leave the rest for GSIs.

Table Schema

Our base table will use “pk” as the partition key and “sk” as the sort key.

We also use “entityType” to easily see what type of entity an item is.

Let’s explain the table data model above:

A user is either a rider or a driver. They can have metadata type items — such as name, type (rider or driver), contact info, etc.

A region partition can hold active driver items. These store driver-specific attributes like vehicle info, driver name, contact info, etc.

A ride partition can hold ride metadata such as duration of the ride, payment details and other related info.
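To make the key overloading concrete, here is a minimal sketch of how the pk and sk values could be built. The helper names and every attribute other than pk, sk and entityType are assumptions made for illustration, not part of the original table.

```python
# Hypothetical key builders for the overloaded pk/sk attributes.
def user_key(role: str, user_id: str) -> dict:
    """Profile item: pk and sk are identical, e.g. RIDER#101 / RIDER#101."""
    key = f"{role.upper()}#{user_id}"
    return {"pk": key, "sk": key}


def region_driver_key(region_id: str, driver_id: str) -> dict:
    """Active driver item stored inside a region partition."""
    return {"pk": f"REGION#{region_id}", "sk": f"DRIVER#{driver_id}"}


def rider_ride_key(rider_id: str, ride_id: str) -> dict:
    """Ride item stored inside the rider's partition."""
    return {"pk": f"RIDER#{rider_id}", "sk": f"RIDE#{ride_id}"}


# Sample items; the name, vehicle and status values are made up.
items = [
    {**user_key("rider", "101"), "entityType": "rider", "name": "Sample Rider"},
    {**region_driver_key("A", "201"), "entityType": "driver", "vehicle": "Sample Car"},
    {**rider_ride_key("102", "501"), "entityType": "ride", "status": "active"},
]

for item in items:
    print(item["pk"], item["sk"], item["entityType"])
```

Because every entity shares the same two key attributes, one table can hold all of them, and the prefix tells us which kind of item we are looking at.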

Let’s start implementing our access patterns.

1. Get a driver or rider by userID

By providing the driver’s or rider’s userID we can fetch their profile using the base table pk and sk.

Query:

pk = "RIDER#101" AND sk = "RIDER#101"

This will give us static details of the user, such as their name or, for a driver, their car make.
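Since both pk and sk are known exactly, this lookup can also be done as a GetItem call instead of a Query. Here is a rough sketch of the request in the low-level DynamoDB format; the table name "RideSharing" and the helper name are assumptions.

```python
def get_user_request(table_name: str, role: str, user_id: str) -> dict:
    """Build GetItem parameters for a rider or driver profile item."""
    key = f"{role.upper()}#{user_id}"
    return {
        "TableName": table_name,
        # Low-level attribute format: {"S": ...} marks a string value.
        "Key": {"pk": {"S": key}, "sk": {"S": key}},
    }


request = get_user_request("RideSharing", "rider", "101")
print(request)

# With boto3 installed and AWS credentials configured, the request could be
# sent like this (not executed here):
#   boto3.client("dynamodb").get_item(**request)
```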

2. List active drivers in a given region

By providing the region identifier (zip code or geo-spatial grid ID) we can get all active drivers inside it.

pk = "REGION#A" AND begins_with(sk, "DRIVER#")

All active drivers are stored within a region partition under the “DRIVER#” prefix, so using sk begins_with “DRIVER#” will get us all available drivers.
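Here is how that query could look as Query parameters in the low-level DynamoDB format. This is only a sketch: the table name "RideSharing" and the helper name are assumptions, and the prefix follows the query shown above.

```python
def active_drivers_request(table_name: str, region_id: str) -> dict:
    """Build Query parameters for all driver items in one region partition."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "pk = :pk AND begins_with(sk, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"REGION#{region_id}"},
            ":prefix": {"S": "DRIVER#"},
        },
    }


request = active_drivers_request("RideSharing", "A")
print(request["KeyConditionExpression"])
print(request["ExpressionAttributeValues"])

# With boto3 this could be sent as (not executed here):
#   boto3.client("dynamodb").query(**request)
```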

The caveat here is that every time a driver moves out of their grid, you have to use a transaction to delete the driver item and recreate it in the new region partition.
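The region move could be sketched as a TransactWriteItems request with one Delete and one Put, so that the driver never ends up in two regions or in none. The table name, helper name, sample attributes and the condition on the delete are assumptions for illustration.

```python
def move_driver_request(table_name: str, driver_id: str,
                        old_region: str, new_region: str,
                        attributes: dict) -> dict:
    """Build TransactWriteItems parameters that move a driver between regions."""
    if old_region == new_region:
        # A transaction cannot contain two operations on the same item.
        raise ValueError("old and new region must differ")
    sk = {"S": f"DRIVER#{driver_id}"}
    old_key = {"pk": {"S": f"REGION#{old_region}"}, "sk": sk}
    new_item = {"pk": {"S": f"REGION#{new_region}"}, "sk": sk, **attributes}
    return {
        "TransactItems": [
            {
                "Delete": {
                    "TableName": table_name,
                    "Key": old_key,
                    # Assumed safeguard: cancel the whole transaction if the
                    # driver item is not present in the old region.
                    "ConditionExpression": "attribute_exists(pk)",
                }
            },
            {"Put": {"TableName": table_name, "Item": new_item}},
        ]
    }


request = move_driver_request(
    "RideSharing", "201", "A", "B",
    {"entityType": {"S": "driver"}, "name": {"S": "Sample Driver"}},
)
for operation in request["TransactItems"]:
    print(operation)

# With boto3 this could be sent as (not executed here):
#   boto3.client("dynamodb").transact_write_items(**request)
```

Both writes succeed or fail together, which is what keeps the region partitions consistent while drivers move around.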

3. Get current ride for a driver or rider

To get the ride details for a given driver/rider, we can use the userID as the partition key and the “RIDE” prefix on the sort key to get all rides. We then use a filter for active rides only.

pk = "RIDER#102" AND begins_with(sk, "RIDE#")
FilterExpression: status = "active"

This returns the two items from our table below.
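Here is a sketch of the same query as Query parameters. Two things are worth noting: status is a reserved word in DynamoDB, so it is aliased through ExpressionAttributeNames, and a filter expression is applied after the items are read, so the query still consumes read capacity for every ride in the partition. The table name and helper name are assumptions.

```python
def current_ride_request(table_name: str, role: str, user_id: str) -> dict:
    """Build Query parameters for the active ride(s) of a rider or driver."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "pk = :pk AND begins_with(sk, :prefix)",
        # "status" is a DynamoDB reserved word, so it needs a #alias.
        "FilterExpression": "#status = :active",
        "ExpressionAttributeNames": {"#status": "status"},
        "ExpressionAttributeValues": {
            ":pk": {"S": f"{role.upper()}#{user_id}"},
            ":prefix": {"S": "RIDE#"},
            ":active": {"S": "active"},
        },
    }


request = current_ride_request("RideSharing", "rider", "102")
print(request["FilterExpression"], request["ExpressionAttributeNames"])
```

Dropping the FilterExpression, ExpressionAttributeNames and the :active value from this request gives the unfiltered version of the query.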

4. Get ride history for a rider or driver

This is the same as the above; we just don’t add a filter expression.

pk = "RIDER#102" AND begins_with(sk, "RIDE#")

This returns the two items from our table below.
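A long ride history may not fit in one response, because a single Query call returns at most 1 MB of data and signals more results with LastEvaluatedKey. Here is a minimal pagination sketch; fetch_all_items and fake_query are hypothetical names, and fake_query only stands in for a real client so the example runs without AWS access.

```python
def fetch_all_items(query_fn, request: dict) -> list:
    """Call query_fn repeatedly until no LastEvaluatedKey is returned."""
    items = []
    params = dict(request)  # copy so the caller's dict is not modified
    while True:
        page = query_fn(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        params["ExclusiveStartKey"] = last_key


def fake_query(**params):
    """Stand-in for a DynamoDB client's query method, serving two pages."""
    pages = {
        None: {
            "Items": [{"sk": {"S": "RIDE#501"}}],
            "LastEvaluatedKey": {"sk": {"S": "RIDE#501"}},
        },
        "RIDE#501": {"Items": [{"sk": {"S": "RIDE#502"}}]},
    }
    start = params.get("ExclusiveStartKey")
    return pages[start["sk"]["S"] if start else None]


history = fetch_all_items(fake_query, {"TableName": "RideSharing"})
print([item["sk"]["S"] for item in history])  # ['RIDE#501', 'RIDE#502']
```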

5. List active rides in a given region

Here we can use the “REGION” prefix and region identifier for the pk, and simply “RIDE” prefix for the sort key.

We also need to supply the active filter to get only active rides.

pk = "REGION#A" AND begins_with(sk, "RIDE#")
FilterExpression: status = "active"

This returns the three items from our table below.

6. Get payment details for a given ride

We can provide the ride ID and get the payment details.

Since it is the rider that pays the driver, we can store payment details directly in a ride item or as its own item.

But we can more efficiently store it as its own item in the rider’s partition, under the ride’s sort key prefix. Since we know the rider who needs to make the payment as well as the ride, we can easily get the payment item like so:

pk = "RIDER#101" AND begins_with(sk, "RIDE#502#PAYMENT#")

This returns the following item from our table below.
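To see why the hierarchical sort key works, here is a small in-memory sketch that mimics begins_with on a single partition. It does not talk to DynamoDB, and the payment ID 9001 and ride 503 are made-up sample values.

```python
def begins_with(items: list, pk: str, prefix: str) -> list:
    """Mimic a Query: match the partition key, then prefix-match the sort key."""
    return sorted(
        item["sk"]
        for item in items
        if item["pk"] == pk and item["sk"].startswith(prefix)
    )


partition = [
    {"pk": "RIDER#101", "sk": "RIDER#101"},
    {"pk": "RIDER#101", "sk": "RIDE#502"},
    {"pk": "RIDER#101", "sk": "RIDE#502#PAYMENT#9001"},
    {"pk": "RIDER#101", "sk": "RIDE#503"},
]

# The narrow prefix matches only the payment item of ride 502.
print(begins_with(partition, "RIDER#101", "RIDE#502#PAYMENT#"))

# The broad prefix matches the rides and the payment item nested under them.
print(begins_with(partition, "RIDER#101", "RIDE#"))
```

With this layout, a broad “RIDE#” prefix returns payment items alongside the rides, so the caller can use entityType to tell them apart.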

7. Get all ratings for a given driver

Getting ratings for a driver is similar to getting payments above.

pk = "DRIVER#201" AND begins_with(sk, "RATING#")

This returns the following item from our table below.

Remember, we have covered only a few of the main access patterns here. If we have more access patterns to satisfy, we can create GSIs for them.

If you would like to learn how to work with GSIs, I encourage you to read this article.

Summary

In this article, I explore how to design a DynamoDB table for a ride-sharing app using a single table layout that efficiently supports complex queries.

By modeling entities like riders, drivers, rides, payments, and ratings in a unified table structure, the design enables retrieval of related data with minimal queries.

I also go over the details of table schemas and DynamoDB queries to handle key access patterns such as finding nearby drivers, retrieving ride history, and obtaining payment details.

👋 My name is Uriel Bitton and I’m committed to helping you master AWS, serverless and DynamoDB.

🚀 Build the database your business needs with DynamoDB — subscribe to my email newsletter: https://excelling-with-dynamodb.beehiiv.com/subscribe.

☕️ Need help with DynamoDB? Book a 1:1 with me here: https://calendly.com/urielas1/dynamodb-consultations.

Thanks for reading and see you in the next one!
