DEV Community

Lillian Dube


You Will Regret Not Designing Your Progression Rank System For Scale From Day One

The Problem We Were Actually Solving

I still remember the day our game servers hit a period of rapid growth and our progression rank system began to show its weaknesses. We had designed it with the best intentions, following the Veltrix documentation to the letter, but as our user base expanded, the system started to creak under the pressure. Players saw delayed rank updates, and in some cases their ranks did not update at all. Our support team was inundated with complaints, and it was clear that we needed to act quickly. The root of the problem was our naive approach to data consistency and our failure to consider scalability requirements from the outset.

What We Tried First (And Why It Failed)

Our first attempted fix was to run the cron jobs that updated progression ranks more frequently. We thought that by running these jobs more often, we could catch up with the backlog of updates and prevent delays. Instead, the extra runs increased the load on our database and slowed down other parts of the system. We saw error messages like "Database connection timeout" and "Too many connections", which indicated that the database could not cope with the additional traffic. We also tried to optimize our database queries, but this provided only a temporary reprieve, and the problem soon returned. It was clear that we needed a more fundamental overhaul of our system.
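One plausible way that running batch jobs more often leads to "Too many connections" is overlap: if a run takes longer than the interval between runs, several runs hold database connections at once. The post does not say this is exactly what happened to us, so treat the sketch below as an illustration under that assumption. The function name and the durations are hypothetical, and it assumes nothing (such as a lock) stops a new run from starting while an old one is still going.

```python
import math


def overlapping_runs(run_seconds: float, interval_seconds: float) -> int:
    """Maximum number of job runs in flight at the same time.

    Assumes a new run starts every interval_seconds, each run lasts
    run_seconds, and no lock prevents runs from overlapping.
    """
    if run_seconds <= 0 or interval_seconds <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(run_seconds / interval_seconds)


if __name__ == "__main__":
    # Hypothetical numbers: a rank-update run that takes 90 seconds.
    for interval in (300, 60, 30):
        print(interval, overlapping_runs(90, interval))
```

Under these assumptions, shortening the interval below the run duration multiplies the number of concurrent runs, and each of them competes for the same connection pool.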

The Architecture Decision

After much discussion and analysis, we decided to redesign our progression rank system with scalability in mind. We chose a message queue, specifically Apache Kafka, to carry player rank updates. This decoupled the update process from the rest of the system and let us process updates asynchronously. We also introduced a caching layer, using Redis, to reduce the load on our database and improve read performance. Finally, we made our consistency model explicit: we adopted eventual consistency, accepting that a rank may be briefly out of date in exchange for higher availability. This decision was not without its tradeoffs, but we felt that it was necessary to ensure the scalability and reliability of our system.
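To make the shape of this design concrete, here is a minimal sketch of the flow: game servers publish events, a consumer applies them, and reads are served from a cache that may lag behind. It is not our production code. To stay self-contained it uses Python's standard-library queue.Queue as a stand-in for a Kafka topic and plain dictionaries as stand-ins for the database and Redis. The class names, rank names, and XP thresholds are all hypothetical.

```python
import queue
from dataclasses import dataclass

# Hypothetical rank thresholds (minimum XP, rank name), highest first.
RANK_THRESHOLDS = [(1000, "gold"), (500, "silver"), (0, "bronze")]


@dataclass(frozen=True)
class XpEvent:
    player_id: str
    xp_gained: int


def rank_for(xp: int) -> str:
    """Return the highest rank whose minimum XP the player has reached."""
    for minimum, name in RANK_THRESHOLDS:
        if xp >= minimum:
            return name
    return RANK_THRESHOLDS[-1][1]


class RankPipeline:
    def __init__(self) -> None:
        self.events: "queue.Queue[XpEvent]" = queue.Queue()  # stand-in for a Kafka topic
        self.xp_store: dict = {}    # stand-in for the database
        self.rank_cache: dict = {}  # stand-in for Redis

    def publish(self, event: XpEvent) -> None:
        """Producer side: enqueue the event and return immediately."""
        self.events.put(event)

    def read_rank(self, player_id: str) -> str:
        """Read path: served from the cache, so it may be briefly stale."""
        return self.rank_cache.get(player_id, RANK_THRESHOLDS[-1][1])

    def drain(self) -> int:
        """Consumer side: apply queued events and refresh the cache.

        Returns the number of events processed.
        """
        processed = 0
        while True:
            try:
                event = self.events.get_nowait()
            except queue.Empty:
                return processed
            total = self.xp_store.get(event.player_id, 0) + event.xp_gained
            self.xp_store[event.player_id] = total
            self.rank_cache[event.player_id] = rank_for(total)
            processed += 1


if __name__ == "__main__":
    pipeline = RankPipeline()
    pipeline.publish(XpEvent("player-1", 600))
    print(pipeline.read_rank("player-1"))  # stale: consumer has not run yet
    pipeline.drain()
    print(pipeline.read_rank("player-1"))  # updated once the event is applied
```

The gap between publish and drain is the eventual consistency tradeoff in miniature: the write returns right away, and readers see the old rank until the consumer catches up. In a real deployment the consumer would run continuously in its own process, and delivery guarantees, retries, and cache expiry would all need deliberate choices that this sketch leaves out.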

What The Numbers Said After

The impact of our changes was significant. We saw a reduction in database load of over 70%, and the number of errors related to database connections decreased by 90%. The average time it took to update a player's rank decreased from 10 seconds to less than 1 second. We also saw an improvement in overall system performance, with the average response time decreasing by 30%. Perhaps most importantly, our support team saw a significant reduction in complaints related to the progression rank system, with a decrease of over 80%. These numbers clearly indicated that our new design was more scalable and reliable than our previous approach.

What I Would Do Differently

In hindsight, I would have designed the progression rank system with scalability in mind from the outset. I would have chosen a message queue and an explicit data consistency model from the start, rather than trying to bolt them on later. I would also have performed more thorough load testing and simulation to identify potential bottlenecks earlier. Additionally, I would have considered a more cloud-native approach, such as a serverless architecture, to reduce the administrative burden and improve scalability. While our eventual solution worked well, I believe that a more forward-thinking approach would have saved us a significant amount of time and effort in the long run.



