Most web apps boil down to CRUD - Create, Read, Update, Delete.
But every now and then, a project comes along that breaks the monotony. Instead of just wiring up endpoints and database rows, you need to start thinking about systems — processes, configs, orchestration, resilience.
This post is a reflection on one such project I worked on. A client in the video re-streaming¹ space wanted a control dashboard for their customers — something simple on the surface:
"Let users toggle their streams on and off."
Sounds easy, right? Just a button, an API call, done.
Except… those toggles had to orchestrate system processes, job queues, config updates, and real-time feedback loops — all while hiding that complexity behind a clean web UI.
It turned out to be one of the most fascinating technical problems I’ve worked on. Let me walk you through the challenges we hit and the solutions we designed.
1. State Management: Database vs. System
Most web apps can treat the database as the ultimate source of truth. Not here.
The streaming process only respected its configuration file, which lived on disk. That file dictated whether a stream was active or not.
This raised a tricky problem:
- If the database said "stream = on" but the file said otherwise, who do we believe?
- How do we keep both in sync without introducing drift?
Approach
- We treated the configuration file as the source of truth.
- The API server became a translation layer: user actions → update config file → reflect state in DB.
- On startup, the system reconciled the DB against the file to re-establish a consistent state.
This ensured that if the process restarted or a manual change occurred, we didn’t blindly trust stale DB values.
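The startup reconciliation step can be sketched in a few lines. This is a minimal illustration, not the project's actual code: it assumes the config is a JSON file mapping stream IDs to settings and that the database has a `streams` table with an `active` column — both hypothetical names.

```python
import json
import sqlite3


def reconcile(db: sqlite3.Connection, config_path: str) -> None:
    """Overwrite stale 'active' flags in the DB from the on-disk config,
    which the streaming process treats as its source of truth."""
    with open(config_path) as f:
        config = json.load(f)  # e.g. {"stream-a": {"active": true}, ...}

    for stream_id, settings in config.items():
        db.execute(
            "UPDATE streams SET active = ? WHERE id = ?",
            (int(settings["active"]), stream_id),
        )
    db.commit()
```

The direction of the sync is the whole point: the file wins, and the database row is corrected to match it, never the other way around.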
2. Multi-Step Orchestration
Toggling a stream wasn’t as simple as flipping a flag. The real sequence looked like this:
- Update the configuration file.
- Restart the streaming process.
- Wait until the process stabilized and confirmed readiness.
Each step could fail. For example:
- File write errors (permissions, disk issues).
- Process restart failure.
- Process enters a crash loop.
Approach
- We built an asynchronous job queue with retries and compensation.
- Each toggle request became a background job.
- Jobs moved through states: `PENDING → RUNNING → COMPLETED/FAILED`.
- If a step failed, the job queue retried or rolled back gracefully (e.g., restore old config).
- The system ensured no two jobs interfered, keeping operations atomic from the user’s perspective.
This was crucial for stability—otherwise a single half-completed toggle could corrupt the live state.
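The state machine above can be sketched as a single job-runner function. Everything here is illustrative rather than the project's real implementation: the three steps (write config, restart process, wait for readiness) are injected as callables that raise on failure, and on exhausted retries the old config is restored as compensation.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


def run_toggle_job(job, write_config, restart_process, wait_ready,
                   max_retries: int = 3) -> JobState:
    """Drive one toggle job through PENDING -> RUNNING -> COMPLETED/FAILED.
    write_config returns the previous value so it can be rolled back."""
    job["state"] = JobState.RUNNING
    backup = write_config(job["stream_id"], job["enabled"])
    for attempt in range(1, max_retries + 1):
        try:
            restart_process(job["stream_id"])
            wait_ready(job["stream_id"])  # blocks until the process is stable
            job["state"] = JobState.COMPLETED
            return job["state"]
        except RuntimeError:
            if attempt == max_retries:
                # Compensation: restore the old config so we never leave
                # a half-applied toggle behind.
                write_config(job["stream_id"], backup)
                restart_process(job["stream_id"])
                job["state"] = JobState.FAILED
    return job["state"]
```

Serializing these jobs per stream (one at a time) is what made each toggle atomic from the user's perspective.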
3. Real-Time Feedback to Users
Users didn’t want to "fire and forget." They wanted to know if their toggle request succeeded.
A plain HTTP request wasn’t enough because the job was async. Returning "OK" immediately would lie — the process might still be restarting.
Approach
- We built a persistent WebSocket channel between server and client.
- When a toggle was requested, the client got back a `jobId`.
- The UI subscribed over WebSocket to that job's status.
- As the background job progressed, the server pushed updates (`in progress…`, `completed`, or `failed`).
- The UI displayed real-time status, so customers always knew the outcome.
This turned a frustrating black box into a transparent, transactional experience.
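The core of that feedback loop is a small publish/subscribe hub keyed by `jobId`. Here is a transport-agnostic sketch with hypothetical names: in production, each `send` callback would wrap an actual WebSocket connection rather than a plain function.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class JobStatusHub:
    """Fan out job-status updates to every client subscribed to a jobId."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, job_id: str, send: Callable[[str], None]) -> None:
        self._subs[job_id].append(send)

    def publish(self, job_id: str, status: str) -> None:
        for send in self._subs.get(job_id, []):
            send(status)
        if status in ("completed", "failed"):  # terminal state: drop subscribers
            self._subs.pop(job_id, None)
```

The background job worker calls `publish` at each state transition, and subscribers are cleaned up automatically once a job reaches a terminal state.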
A sequence diagram of the workflow can be viewed here.
4. Key Takeaways
This project was a reminder that:
- Source of truth matters. Sometimes it’s not the database.
- Asynchronous orchestration is unavoidable when dealing with external systems.
- Real-time feedback builds trust. Users don’t just want features—they want visibility into what’s happening.
- Architecture resilience > code cleverness. The hardest part wasn’t writing code, it was designing a system that wouldn’t collapse under edge cases.
Final Thoughts
What looked like a trivial feature—"toggle a stream"—actually turned into a system design problem, touching on state synchronisation, async orchestration, and real-time communication.
For me, this was one of the most rewarding technical challenges I’ve worked on. It taught me that the real engineering battles often happen at the intersection of web applications and the messy, stateful systems they control.
💬 I’d love to hear from others:
Have you ever faced a "simple" feature request that turned into a full-on system design challenge?
¹ Video re-streaming: The process of taking a live video feed from one source and re-broadcasting it to multiple platforms (e.g., streaming a live event to YouTube, Facebook, and Twitch simultaneously).