I currently have a Discord bot that uses MongoDB as its database.
I'm new to programming, so when I created it I didn't have the slightest notion about structuring code, let alone about database design. But that's okay, what matters is that it worked, or should have. Menhera (my bot) grew very fast, and went from 4k servers to 7k in the blink of an eye. The consequence of being in so many servers is too many messages coming in at the same time.
Each command execution did about 3 queries in the database: one for server data, one for user data, and one for command data. If the command used extra features, such as the economy, it ran even more queries. At first this was fine, but then the database started to crash. To give you an idea, here is a screenshot of the API ping. And look, the API ping was literally a query to modify the ping data, so it was basically a simple update, and it took about 300ms to execute.
If that wasn't enough, with more than a thousand users using commands at the same time, the database simply stopped responding, causing the bot to just stop. The next image shows the queries per second at that time
With that number of queries the bot stopped. I needed a solution for this, because I couldn't let so many queries run. It was then that I decided to use Redis. That's what I needed: I could remove two queries with a cache, the server and command ones, since that information rarely changes, and when it does, the changes are small. Perfect, I started a container with Redis and integrated it with the bot. Problem solved. That's what I thought. From 100 queries per second it went down to 60. It was better than before, but it still wasn't enough.
For the time being the bot was running normally, and the two fewer queries cut command response time by about 70ms. Of course that was great for now, but with the bot's growth it wouldn't be enough: the more servers, the worse it would get.
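To make the idea concrete, here is a minimal sketch of that kind of cache for server data. It is not the bot's actual code: the `KeyValueCache` interface, the `GuildConfig` shape, the `guild:` key prefix, and the one hour expiry are all assumptions made for illustration, and the in-memory class only stands in for Redis so the example runs on its own.

```typescript
// Minimal async key-value interface (hypothetical). A real Redis client could be
// adapted to this shape; the in-memory class below is only a stand-in.
interface KeyValueCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

class InMemoryCache implements KeyValueCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (Date.now() >= entry.expiresAt) {
      // Expired entries behave like a cache miss.
      this.store.delete(key);
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }

  async del(key: string): Promise<void> {
    this.store.delete(key);
  }
}

// Hypothetical shape of the per-server settings.
interface GuildConfig {
  prefix: string;
  language: string;
}

// Read path: try the cache first and only query the database on a miss.
async function getGuildConfig(
  cache: KeyValueCache,
  guildId: string,
  loadFromDatabase: (guildId: string) => Promise<GuildConfig>,
  ttlSeconds = 3600,
): Promise<GuildConfig> {
  const key = `guild:${guildId}`;
  const cached = await cache.get(key);
  if (cached !== null) return JSON.parse(cached) as GuildConfig;

  const fresh = await loadFromDatabase(guildId);
  await cache.set(key, JSON.stringify(fresh), ttlSeconds);
  return fresh;
}

// Write path: save to the database, then drop the cached copy so the next
// read reloads the new value.
async function updateGuildConfig(
  cache: KeyValueCache,
  guildId: string,
  config: GuildConfig,
  saveToDatabase: (guildId: string, config: GuildConfig) => Promise<void>,
): Promise<void> {
  await saveToDatabase(guildId, config);
  await cache.del(`guild:${guildId}`);
}
```

Because server data rarely changes, almost every read after the first one is answered from the cache, and the database only sees a query again after an update or when the entry expires.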
That's when I found the hole in my code that probably made everything go wrong.
The bot has an AFK module that identifies when a user who was AFK sends a message. To do that, I need to check the user who sent the message first, regardless of whether the message was a command or not. And there was the problem: I did a query to see if the user was AFK before EVERYTHING. Literally EVERYTHING. So EVERY message that any user sent was a query.
Now imagine 7700 servers spamming messages where each message was a query, regardless of whether it was a command or a simple 'lol'. I was extremely frustrated with myself, as this was literally what had screwed everything up. It was right in front of me, at the top of the code, and I didn't realize it until months later. After that it was just a matter of solving it by creating a cache for AFK. Every user message is still a query, yes, but now it goes to Redis. So the database is now only queried for user data if the message is a command, plus additional queries for the commands that need them.
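The new message flow can be sketched roughly like this. It is a simplified illustration and not the bot's real handler: the `AfkStore` interface, the `handleMessage` function, the `!` prefix, and clearing the AFK flag when the user speaks are all assumptions, and the `Set`-backed store only stands in for Redis so the example runs on its own.

```typescript
// Hypothetical store for AFK flags. In the bot this role is played by Redis;
// the Set-backed class below is only a stand-in.
interface AfkStore {
  isAfk(userId: string): Promise<boolean>;
  setAfk(userId: string): Promise<void>;
  clearAfk(userId: string): Promise<void>;
}

class InMemoryAfkStore implements AfkStore {
  private afkUsers = new Set<string>();

  async isAfk(userId: string): Promise<boolean> {
    return this.afkUsers.has(userId);
  }

  async setAfk(userId: string): Promise<void> {
    this.afkUsers.add(userId);
  }

  async clearAfk(userId: string): Promise<void> {
    this.afkUsers.delete(userId);
  }
}

interface IncomingMessage {
  authorId: string;
  content: string;
}

interface HandleResult {
  afkCleared: boolean;
  commandRan: boolean;
}

async function handleMessage(
  message: IncomingMessage,
  afk: AfkStore,
  loadUserFromDatabase: (userId: string) => Promise<unknown>,
  prefix = "!",
): Promise<HandleResult> {
  // This check still runs for every message, but it only touches the cache.
  let afkCleared = false;
  if (await afk.isAfk(message.authorId)) {
    // Assumed behavior: a user who speaks is no longer AFK.
    await afk.clearAfk(message.authorId);
    afkCleared = true;
  }

  // A plain chat message such as 'lol' stops here and never reaches the database.
  if (!message.content.startsWith(prefix)) {
    return { afkCleared, commandRan: false };
  }

  // Only commands pay for a database query.
  await loadUserFromDatabase(message.authorId);
  return { afkCleared, commandRan: true };
}
```

The important design choice is the order: the cheap cache lookup comes first, and the database call sits behind the command check, so the number of database queries follows command usage instead of total chat volume.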
With that, I reduced the queries per second to less than 16. That's what I needed. Now the bot is much faster, and the database is not overloaded.
It took me 1 week and 5 days to resolve this. The bot was literally offline for 12 days until I finally identified the error and fixed it.
As a bonus, besides fixing the database problem, I also converted the entire codebase to TypeScript. Now it's time to focus on the future and keep using and abusing this wonder that is Redis.
I hope this helps more people, and that it encourages using Redis for queries that are mostly repetitive. Redis improves not only the response time, but also reduces the load put on the main database.
The project where all of this happened is open source, and it is here
This was my first post here on dev.to. I want to use it to show problems that I faced as a beginner, and to help other beginners who go through the same problem.