Over the past two days, some visitors to our marketing website may have run into downtime and errors. In this post we explain what happened, how we fixed it, and what we learned from the experience.
The cause was a migration: we moved our marketing website to our new Hetzner servers, as announced in Future Friday #2: New Servers. Along the way, we made a few mistakes that caused temporary issues on the website.
The migration was divided into two parts. First, we set up the new infrastructure. Second, we migrated all required data from the old database. The main problems occurred during the infrastructure setup.
We encountered issues with communication between Docker containers. Unfortunately, it took longer than expected to identify the cause because we were troubleshooting late at night. After investigating several possibilities, we discovered that the problem was a small configuration mistake.
A missing allowed-host entry caused every API request from the frontend to the backend to fail. The fix itself was simple, but pinpointing the source was difficult because the relevant configuration was spread across multiple files and locations.
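To illustrate the kind of check that would have caught this earlier, here is a minimal sketch of an allowed-hosts validation that could run before a deployment. It is not our actual backend code: the function names `is_host_allowed` and `missing_hosts`, the hostnames, and the matching rules (exact match, a leading dot for subdomains, `*` for everything) are all assumptions modeled on how many web frameworks handle such lists.

```python
def is_host_allowed(host: str, allowed_hosts: list[str]) -> bool:
    """Return True if `host` matches an entry in `allowed_hosts`.

    Assumed matching rules (hypothetical, not taken from a specific framework):
      - "*" matches any host
      - ".example.com" matches example.com and all of its subdomains
      - any other entry must match the hostname exactly (case-insensitive)
    """
    # Drop an optional ":port" suffix and a trailing dot.
    # Note: IPv6 literals are not handled in this sketch.
    hostname = host.lower().split(":", 1)[0].rstrip(".")

    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern == "*":
            return True
        if pattern.startswith("."):
            if hostname == pattern[1:] or hostname.endswith(pattern):
                return True
        elif hostname == pattern:
            return True
    return False


def missing_hosts(required: list[str], allowed_hosts: list[str]) -> list[str]:
    """Return every required host that the allowed list would not accept."""
    return [h for h in required if not is_host_allowed(h, allowed_hosts)]


if __name__ == "__main__":
    # Hypothetical values: "backend" stands in for a container name used
    # by the frontend to reach the API.
    allowed = [".example.com"]
    required = ["www.example.com", "backend:8000"]
    print(missing_hosts(required, allowed))
```

Running a check like this against the hostnames each container actually uses would surface a missing entry before any request fails in production.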
We also ran into a few smaller issues. For example, an incorrect file permission prevented the QueueForge logo from loading. Several other minor configuration errors also had to be fixed during the migration.
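A permission problem like the one with the logo can be detected with a few lines of code. The sketch below assumes the web server runs as a different user than the file owner, so the "other" read bit is the one that matters; the helper names `is_world_readable` and `unreadable_assets` and the example path are hypothetical.

```python
import os
import stat


def is_world_readable(path: str) -> bool:
    """Return True if the 'other' read bit is set on `path`."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


def unreadable_assets(paths: list[str]) -> list[str]:
    """Return the paths that a process running as another user could not read.

    This only inspects the file itself. Parent directories also need the
    execute bit set for the file to be reachable, which is not checked here.
    """
    return [p for p in paths if not is_world_readable(p)]


if __name__ == "__main__":
    # Hypothetical asset path, for illustration only.
    assets = ["static/logo.svg"]
    existing = [p for p in assets if os.path.exists(p)]
    for path in unreadable_assets(existing):
        print(f"not world-readable: {path}")
```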
Although the marketing website is not our main product, we take reliability seriously. Every issue is an opportunity to improve our processes and infrastructure. We want to learn these lessons now so we can provide a stable and reliable experience when we launch our first product.
What We Learned
- Test infrastructure changes in a dedicated environment before deployment.
- Keep configuration files organized and well documented.
- Create deployment checklists for recurring tasks.
- Verify file permissions as part of every migration.
- Improve monitoring and error reporting to detect issues faster.
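One way to turn a deployment checklist into something repeatable is to express each item as a small function and run them all before going live. The following is only a sketch of that idea, not a tool we use today; `run_checklist` and the example checks are hypothetical.

```python
from typing import Callable

# A check is a human-readable name plus a function that returns True on success.
Check = tuple[str, Callable[[], bool]]


def run_checklist(checks: list[Check]) -> list[str]:
    """Run every check and return the names of the ones that failed.

    A check that raises an exception is treated as failed, so one broken
    check does not stop the remaining ones from running.
    """
    failed = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Placeholder checks; real ones would inspect configuration,
    # file permissions, or service health.
    checks = [
        ("allowed hosts include all container names", lambda: True),
        ("static assets are readable", lambda: True),
    ]
    failures = run_checklist(checks)
    print("all checks passed" if not failures else f"failed: {failures}")
```

Returning the failed names instead of stopping at the first error gives a complete picture of what needs fixing in a single run.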
While no migration goes perfectly, every challenge helps us improve. We appreciate your patience and understanding while we continue building a reliable platform.
The QueueForge Team