DEV Community

mary moloyi

The Veltrix Problem Is Not Your Configuration

The Problem We Were Actually Solving

The client wanted to host an annual treasure hunt for their employees, with teams competing to solve a series of puzzles and challenges. We took the job, and I spent a sleepless night frantically building a web-based engine to manage the hunt. I scoured the web and tried every solution I could find, until I stumbled upon a library that made it easy enough to build a simple interface. That was a mistake.

What We Tried First (And Why It Failed)

Our first approach was to use a general-purpose web framework such as Django or Ruby on Rails, but we didn't have the time or resources to set it up properly. We ended up with a hacky, half-baked implementation held together by duct-taped scripts. We also used a relational database, hoping its structure would simplify puzzle management. Instead, our poorly indexed queries took minutes to run and slowed the system to a crawl. It was a recipe for disaster.
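To make the indexing point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The post does not say which database or schema we used, so the `submissions` table, its columns, and the index name are all hypothetical. The sketch only shows how the query plan for the same lookup changes once an index covers the filtered columns.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Hypothetical schema: one row per answer a team submits for a puzzle.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE submissions ("
    "id INTEGER PRIMARY KEY, team_id INTEGER, puzzle_id INTEGER, answer TEXT)"
)
conn.executemany(
    "INSERT INTO submissions (team_id, puzzle_id, answer) VALUES (?, ?, ?)",
    [(i % 50, i % 20, f"guess-{i}") for i in range(1000)],
)

lookup = "SELECT answer FROM submissions WHERE team_id = ? AND puzzle_id = ?"

# Without an index on the filtered columns, SQLite scans the whole table.
plan_before = query_plan(conn, lookup, (7, 7))
rows_before = conn.execute(lookup, (7, 7)).fetchall()

# A composite index lets the same query seek directly to matching rows.
conn.execute(
    "CREATE INDEX idx_submissions_team_puzzle "
    "ON submissions (team_id, puzzle_id)"
)
plan_after = query_plan(conn, lookup, (7, 7))
rows_after = conn.execute(lookup, (7, 7)).fetchall()

print(plan_before)
print(plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan should report a scan and the second should name the index. Checking plans like this before launch is cheap compared with discovering minute-long queries in production.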

The Architecture Decision

About a year after the initial launch, our team realized we needed a complete overhaul. We decided to switch to a graph database, allowing us to model the puzzles and challenges as connected nodes and edges. We also moved to a microservices architecture, with separate services for puzzle generation, team management, and scorekeeping. This would simplify our problem, or so we thought. What we didn't realize was that this would also introduce a multitude of new challenges, including data consistency across services and the inevitable coupling that came with communicating between them.
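The idea of modeling puzzles as connected nodes and edges can be sketched in plain Python, without a graph database. This is an illustration under assumptions, not our production model: the puzzle names and the "must be solved first" edge meaning are hypothetical, and `graphlib` (Python 3.9+) stands in for the traversal a graph database would do.

```python
from graphlib import TopologicalSorter

# Hypothetical hunt: each puzzle maps to the puzzles that must be solved first.
PREREQUISITES = {
    "riddle": set(),
    "cipher": {"riddle"},
    "map": {"riddle"},
    "vault": {"cipher", "map"},
}


def available_puzzles(prerequisites, solved):
    """Return unsolved puzzles whose prerequisites are all in `solved`."""
    return sorted(
        puzzle
        for puzzle, needs in prerequisites.items()
        if puzzle not in solved and needs <= solved
    )


def solve_order(prerequisites):
    """Return one valid solving order.

    Raises graphlib.CycleError if the hunt contains a circular dependency,
    which is a cheap sanity check to run on any new puzzle set.
    """
    return list(TopologicalSorter(prerequisites).static_order())


print(available_puzzles(PREREQUISITES, set()))
print(available_puzzles(PREREQUISITES, {"riddle"}))
print(solve_order(PREREQUISITES))
```

With the dependencies expressed as edges, "what can this team attempt next?" becomes a set comparison instead of a multi-join query, which is the kind of simplification we were after.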

What The Numbers Said After

The new architecture cut our average request response time by 80%, and we were able to handle a 500% increase in concurrent users without incident. We also saw a 90% reduction in puzzle generation errors, thanks to the improved data management capabilities of the graph database. However, we still experienced a series of minor outages, each triggered by a new, previously unknown corner case. Our system was still optimized for demos over operations, and we couldn't shake the feeling that we were just putting Band-Aids on a fundamentally flawed design.

What I Would Do Differently

If I had to start over, I would prioritize a modular, loosely coupled design from the very beginning. I would break the system into smaller components, each with clear responsibilities, and keep data in sync across services with event sourcing, accepting eventual consistency instead of demanding that every service agree at every instant. And when it came to puzzles and challenges, I would use a templating engine to generate them dynamically, rather than relying on a fixed, brittle database schema. It's never too late to learn from your mistakes, and I've seen firsthand that sometimes the best way to solve a problem is to just start over.
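As a rough sketch of what event sourcing with eventual consistency could look like for scorekeeping, the example below keeps an append-only log of events and lets a scoreboard rebuild its state by reading from that log. Everything here is hypothetical (the `PuzzleSolved` event, the `EventLog` and `Scoreboard` classes, the in-memory list standing in for a real message broker or event store), so treat it as an illustration of the pattern, not as the system we built.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PuzzleSolved:
    """Hypothetical event: a fact that happened, never edited afterwards."""
    team: str
    puzzle: str
    points: int


class EventLog:
    """Append-only log; an in-memory stand-in for a real event store."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset=0):
        return self._events[offset:]


class Scoreboard:
    """Read model owned by the scorekeeping service.

    It is only as fresh as its last catch_up() call, which is what
    eventual consistency means in practice.
    """

    def __init__(self):
        self.scores = defaultdict(int)
        self._seen = set()
        self._offset = 0

    def catch_up(self, log):
        for event in log.read_from(self._offset):
            self._offset += 1
            key = (event.team, event.puzzle)
            if key in self._seen:
                continue  # ignore duplicate deliveries of the same solve
            self._seen.add(key)
            self.scores[event.team] += event.points


log = EventLog()
board = Scoreboard()

log.append(PuzzleSolved("red", "riddle", 10))
log.append(PuzzleSolved("blue", "riddle", 10))
log.append(PuzzleSolved("red", "riddle", 10))  # duplicate delivery
log.append(PuzzleSolved("red", "cipher", 25))

stale = dict(board.scores)   # nothing applied yet
board.catch_up(log)
fresh = dict(board.scores)   # converged with the log
print(stale, fresh)
```

Because the log is the source of truth, a service that crashes or falls behind can replay events to recover, and a new read model can be added later without migrating a shared schema.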



