DEV Community

Rohith Kunnath


Making the Share My Trip Feature Much More Robust

Working with one of the top urban mobility providers in Germany was definitely a high point in my career. Every career switch brings a lot of learning and unlearning. For me, one such experience was refactoring, or rather rewriting, the Share My Trip feature.

What is the Share My Trip feature?

Anyone taking a ride booked through the app can share the entire trip route, plus information about the vehicle they are riding in, via a web URL. This is normally used to let someone who is expecting you know your current location. It can also be regarded as a security feature in our fast-moving world.

Old Architecture

Figure: Legacy architecture

The old architecture was a microservice serving users real-time data collected from multiple data sources. The cons of this design clearly outweighed the pros. The ones that caught my eye are listed below.
Increased risk of a Distributed Denial of Service (DDoS) attack.
The URL is public, and every request to it queries the big legacy monolith database with a lot of table joins and calls external driver service APIs, and who knows how many more things will be added there in the future. Each request is expensive, so I would argue it is easy to trigger a DDoS attack.
Inefficient resource usage.
Location Service stores its data in distributed Redis clusters, and what Redis consumes is memory. As a company with thousands of vehicles on the road, we are talking about location updates at a rate of 100k/min. Having features like Share My Trip hit this resource on every request is an inefficient use of it.
Questionable availability and reliability.
Keeping our service available to end users 99.9% of the time, and reliable for them, depends on the health of our mothership (the legacy database), a lot of microservices, and the monolith service that connects to the database.
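To make the availability concern concrete: when a request needs several components to all be healthy, their availabilities multiply. Below is a rough Python sketch of that arithmetic. The component count and the per-component figure of 99.9% are assumptions for illustration, not measurements from the system, and failures are assumed to be independent.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a request path that needs every component to be up.

    Assumes component failures are independent, which is optimistic.
    """
    return prod(availabilities)


# Hypothetical figures: the legacy database, the monolith, and two
# microservices, each available 99.9% of the time on its own.
path = [0.999, 0.999, 0.999, 0.999]
combined = serial_availability(path)
print(f"{combined:.4%}")  # 99.6006%
```

Under these assumptions, four components that each meet 99.9% give roughly 99.6% for the whole path, so the end-to-end target is missed even though every part meets it individually.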

New Architecture

Figure: New architecture

The new architecture is purely event-driven. It uses a hybrid of the Pub/Sub and Producer/Consumer models.
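The difference between the two models can be sketched with a small in-memory stand-in. This is not how RabbitMQ or Redis are implemented, and the class names and message fields are made up for illustration. It only shows the delivery behaviour that matters here: a queue holds messages until a consumer takes them, while a pub/sub channel delivers only to whoever is subscribed at publish time.

```python
from collections import deque


class WorkQueue:
    """Producer/consumer: messages wait in the queue until a consumer takes them."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def consume(self):
        # Returns the oldest pending message, or None if the queue is empty.
        return self._messages.popleft() if self._messages else None


class PubSubChannel:
    """Pub/sub: a message reaches only the subscribers present at publish time."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)
        return len(self._subscribers)  # number of receivers


# A booking event published while the consumer is down is still there later.
bookings = WorkQueue()
bookings.publish({"booking_id": "b-1", "status": "CREATED"})
recovered = bookings.consume()

# A location update published with no subscriber is simply lost.
locations = PubSubChannel()
missed = locations.publish({"driver_id": "d-7", "lat": 52.52, "lon": 13.40})
received = []
locations.subscribe(received.append)
locations.publish({"driver_id": "d-7", "lat": 52.53, "lon": 13.41})
```

One plausible reading of the hybrid design is that booking events must not be lost, so they go through queues, whereas a missed location update is quickly superseded by the next one, so fire-and-forget delivery is acceptable there.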

Stack:

Database - MongoDB
Why? MongoDB makes it easy to absorb schema changes coming from many independent services. It also lets us maintain a TTL for each booking.
Service - Spring Boot
Producer/Consumer - RabbitMQ
Pub/Sub - Redis

This is easiest to understand by walking through a user journey.

  1. A user initiates a search for a cab to travel from Position A to Position B.
  2. Booking Service receives the request and publishes an event to RabbitMQ via a fanout exchange.
  3. Our new service runs consumers that listen for these events and save local copies to our database.
  4. Once a driver is available and allocated to the booking, Driver Service publishes another event to RabbitMQ via a fanout exchange.
  5. Our service updates the driver information for the respective booking, with a TTL of the estimated driving time plus a buffer of 15 minutes.
  6. At the same time, our service creates a new subscription on Location Service for the respective driver ID. This way we receive location updates only from drivers who are on a ride right now, not from every driver.
  7. For every further booking update, we receive an event from Booking Service and apply it to our database.
  8. From the moment we receive the booking event with the status “PASSENGER_CARRY”, the API responds with a 200 status and the respective body content.
  9. From the moment we receive the booking event with the status “PASSENGER_DROPPED”, the Redis subscription for the driver ID is deleted and the TTL for the booking is set to 5 minutes. This makes sure the data exists for only 5 more minutes for anyone to track.
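The journey above can be sketched as a small state holder driven by events. This is a minimal Python illustration, not the Spring Boot service itself: the class and method names are hypothetical, a dictionary and a set stand in for MongoDB and the Redis subscriptions, expiry is checked at read time instead of being handled by the database, and the 404 returned before pickup is an assumption because the post only says when the API starts returning 200.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

DRIVER_ASSIGNED_BUFFER = timedelta(minutes=15)  # buffer from step 5
AFTER_DROP_OFF_TTL = timedelta(minutes=5)       # TTL from step 9


@dataclass
class Trip:
    booking_id: str
    status: str = "CREATED"
    driver_id: Optional[str] = None
    expires_at: Optional[datetime] = None
    shareable: bool = False  # becomes True once PASSENGER_CARRY is seen


class ShareMyTripStore:
    """In-memory stand-in for the booking collection and driver subscriptions."""

    def __init__(self):
        self.trips = {}
        self.subscribed_drivers = set()

    def on_booking_created(self, booking_id):            # steps 2-3
        self.trips[booking_id] = Trip(booking_id)

    def on_driver_assigned(self, booking_id, driver_id, estimated_drive, now):
        trip = self.trips[booking_id]                    # steps 4-6
        trip.driver_id = driver_id
        trip.expires_at = now + estimated_drive + DRIVER_ASSIGNED_BUFFER
        self.subscribed_drivers.add(driver_id)

    def on_booking_status(self, booking_id, status, now):  # steps 7-9
        trip = self.trips[booking_id]
        trip.status = status
        if status == "PASSENGER_CARRY":
            trip.shareable = True
        elif status == "PASSENGER_DROPPED":
            self.subscribed_drivers.discard(trip.driver_id)
            trip.expires_at = now + AFTER_DROP_OFF_TTL

    def get_shared_trip(self, booking_id, now):
        trip = self.trips.get(booking_id)
        if trip is None or not trip.shareable:
            return 404, None  # assumed response before pickup
        if trip.expires_at is not None and now >= trip.expires_at:
            return 404, None  # TTL elapsed
        return 200, {"booking_id": trip.booking_id,
                     "driver_id": trip.driver_id,
                     "status": trip.status}


t0 = datetime(2024, 1, 1, 8, 0)  # arbitrary start time for the demo
store = ShareMyTripStore()
store.on_booking_created("b-1")
store.on_driver_assigned("b-1", "d-7", timedelta(minutes=20), now=t0)
before_pickup = store.get_shared_trip("b-1", now=t0)
store.on_booking_status("b-1", "PASSENGER_CARRY", now=t0 + timedelta(minutes=4))
during_ride = store.get_shared_trip("b-1", now=t0 + timedelta(minutes=10))
store.on_booking_status("b-1", "PASSENGER_DROPPED", now=t0 + timedelta(minutes=25))
just_after_drop = store.get_shared_trip("b-1", now=t0 + timedelta(minutes=27))
after_ttl = store.get_shared_trip("b-1", now=t0 + timedelta(minutes=31))
```

In this run the link is not available before pickup, works during the ride and for five minutes after drop-off, and stops working once the shortened TTL has passed. In a real MongoDB setup the expiry would typically be left to a TTL index on a date field rather than checked on every read.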

Learnings:

Set a maximum limit on your queues so that RabbitMQ does not fill up without bound when your consumers are failing.
Sharding is a priority if you need to scale RabbitMQ. It helps us handle events much faster and perform better during our daily peak times (8 am to 10 am).
Using REST APIs within an event-based architecture is not a great idea. If it is unavoidable, then latency-related complexities, proper re-queuing techniques, and the related error handling need to be evaluated.
Create proper indexes in the document database.
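The first learning, limiting queue size, can be illustrated with a small model. RabbitMQ accepts the queue arguments `x-max-length` and `x-overflow`, and with the default `drop-head` behaviour the oldest messages are discarded once the limit is reached. The sketch below imitates that in plain Python; the limit of 3 is chosen only to keep the demo short, and the post does not say which limit or overflow mode was actually used.

```python
from collections import deque

# Real RabbitMQ queue argument names; the values are demo assumptions.
QUEUE_ARGUMENTS = {"x-max-length": 3, "x-overflow": "drop-head"}


class BoundedQueue:
    """In-memory model of a length-limited queue that drops its oldest message."""

    def __init__(self, max_length):
        self._messages = deque(maxlen=max_length)
        self.dropped = 0

    def publish(self, message):
        if len(self._messages) == self._messages.maxlen:
            self.dropped += 1  # the deque discards its oldest entry on append
        self._messages.append(message)

    def pending(self):
        return list(self._messages)


queue = BoundedQueue(QUEUE_ARGUMENTS["x-max-length"])
for n in range(1, 6):  # consumers are down while five events arrive
    queue.publish(f"event-{n}")

print(queue.pending())  # ['event-3', 'event-4', 'event-5']
print(queue.dropped)    # 2
```

The trade-off is explicit: the broker stays within a known memory budget, at the cost of losing the oldest events if consumers stay down long enough. Whether that loss is acceptable depends on the event type.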
