I Built a Tool That Builds My Infrastructure — Here's How It Went
A brutally honest account of building SwiftDeploy for the HNG14 Stage 4A DevOps challenge
When I first read the Stage 4A task brief, one line jumped out at me:
"Most DevOps tasks ask you to configure infrastructure manually — this one asks you to build the tool that does it for you."
That single sentence changed how I approached the entire challenge. This wasn't about setting up servers or writing config files by hand. It was about building something that does all of that for you, from a single source of truth.
This is the story of how I built SwiftDeploy — and every wall I hit along the way.
What Is SwiftDeploy?
SwiftDeploy is a declarative deployment CLI tool. You describe your entire infrastructure in a single manifest.yaml file, and the tool generates your Nginx config and Docker Compose file, manages your container lifecycle, and keeps your stack healthy.
The stack consists of:
- A FastAPI Python service that runs in either stable or canary mode
- An Nginx reverse proxy that routes all traffic, logs every request, and returns JSON error responses
- A CLI tool written in Python with five subcommands: init, validate, deploy, promote, and teardown
- Everything generated from Jinja2 templates — no manually written config files allowed
The grader would delete my generated files and re-run swiftdeploy init to verify everything regenerates correctly. If the tool broke, the stack broke. No shortcuts.
The Architecture
Before writing a single line of code, I mapped out how everything would connect:
manifest.yaml → swiftdeploy init → nginx.conf + docker-compose.yml
↓
docker-compose up
↓
[nginx:8080] → [app:3000]
The manifest.yaml is the only file a human ever edits. Everything else is derived from it. That constraint is what makes the tool interesting — and what made debugging it so painful at times.
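For a sense of what that single file contains, here's a simplified manifest along these lines. The field names are illustrative rather than the exact schema; only the image name and ports match the stack described here.

# Hypothetical manifest.yaml — field names are illustrative, not the real schema
app:
  image: swift-deploy-1-node:latest
  port: 3000
  mode: stable   # stable or canary
nginx:
  port: 8080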
Building the API Service
The API service is a FastAPI application with three endpoints:
GET / returns a welcome message including the current mode, version, and server timestamp.
GET /healthz returns a liveness check with process uptime in seconds — used by Docker's health check system to determine if the container is ready to serve traffic.
POST /chaos is the interesting one. It's only active in canary mode and lets you simulate degraded behaviour: slow responses, random 500 errors, or a full recovery. This is the kind of endpoint that makes canary deployments genuinely useful — you can test how your system behaves under failure before rolling it out to everyone.
Canary mode also adds an X-Mode: canary header to every response, so you can always tell which mode the service is running in just by inspecting the headers.
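Put together, a minimal version of the service looks something like this. It's a simplified sketch of the behaviour described above, not the production code; the names and structure are assumptions.

# Illustrative FastAPI sketch — not the actual SwiftDeploy source
import os
import time
from datetime import datetime, timezone

from fastapi import FastAPI, Request

MODE = os.getenv("MODE", "stable")
START = time.monotonic()
app = FastAPI()

@app.middleware("http")
async def add_mode_header(request: Request, call_next):
    # Canary responses are tagged so the active mode is visible in the headers
    response = await call_next(request)
    if MODE == "canary":
        response.headers["X-Mode"] = "canary"
    return response

@app.get("/")
def root():
    return {
        "message": f"Welcome to SwiftDeploy API — running in {MODE} mode",
        "mode": MODE,
        "version": "1.0.0",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

@app.get("/healthz")
def healthz():
    # Process uptime in seconds, consumed by Docker's health check
    return {"status": "ok", "uptime_seconds": round(time.monotonic() - START, 2)}

@app.post("/chaos")
def chaos(payload: dict):
    # Only meaningful in canary mode; a real version would toggle slow
    # responses, random 500s, or recovery based on the payload
    if MODE != "canary":
        return {"error": "chaos is only active in canary mode"}
    return {"accepted": payload}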
Building the CLI
The CLI is a single Python script with no external framework — just argparse-style argument handling, PyYAML for parsing the manifest, and Jinja2 for rendering templates.
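The skeleton is small. Wiring up five subcommands with argparse looks roughly like this (a sketch; the dispatch structure is an assumption):

# Illustrative sketch of the CLI entry point
import argparse

def main():
    parser = argparse.ArgumentParser(prog="swiftdeploy")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("init", help="render nginx.conf and docker-compose.yml from the manifest")
    sub.add_parser("validate", help="run pre-flight checks")
    sub.add_parser("deploy", help="init + validate, then bring the stack up")

    promote = sub.add_parser("promote", help="switch the running mode")
    promote.add_argument("mode", choices=["stable", "canary"])

    teardown = sub.add_parser("teardown", help="bring the stack down")
    teardown.add_argument("--clean", action="store_true", help="also delete generated files")

    args = parser.parse_args()
    # Dispatch args.command to the matching handler, e.g. cmd_init(), cmd_deploy(), ...

if __name__ == "__main__":
    main()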
The five subcommands each have a clear responsibility:
init reads the manifest and renders both templates. Simple, fast, deterministic.
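In sketch form, that's just PyYAML feeding Jinja2, something like the following (the compose template name here is illustrative):

# Illustrative sketch of the init step — the compose template name is an assumption
import yaml
from jinja2 import Environment, FileSystemLoader

def cmd_init():
    with open("manifest.yaml") as f:
        manifest = yaml.safe_load(f)

    env = Environment(loader=FileSystemLoader("templates"))
    for template_name, output_path in [
        ("nginx.conf.j2", "nginx.conf"),
        ("docker-compose.yml.j2", "docker-compose.yml"),
    ]:
        rendered = env.get_template(template_name).render(manifest)
        with open(output_path, "w") as f:
            f.write(rendered)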
validate runs five pre-flight checks before anything is deployed. It checks that the manifest exists and is valid YAML, that all required fields are present, that the Docker image exists locally, that the Nginx port is free on the host, and that the generated nginx.conf passes a syntax check.
deploy chains init and validate together, then brings up the stack and blocks until health checks pass — or times out after 60 seconds.
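The blocking wait is nothing fancy: poll the health endpoint until it answers or the deadline passes. Roughly, with the URL and polling interval as assumptions:

# Illustrative sketch of the 60-second health wait
import time
import urllib.request

def wait_healthy(url="http://localhost:8080/healthz", timeout=60):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # stack not ready yet; keep polling
        time.sleep(2)
    return False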
promote is the most complex command. It updates the mode field in manifest.yaml in-place, regenerates docker-compose.yml with the new MODE environment variable, restarts only the app container (not nginx), and then confirms the new mode is active by hitting /healthz.
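The heart of promote is the in-place manifest update. Simplified, with the manifest layout and service name as assumptions:

# Illustrative sketch of the promote flow
import subprocess
import yaml

def cmd_promote(new_mode):
    # 1. Flip the mode field in the single source of truth
    with open("manifest.yaml") as f:
        manifest = yaml.safe_load(f)
    manifest["app"]["mode"] = new_mode
    with open("manifest.yaml", "w") as f:
        yaml.safe_dump(manifest, f, sort_keys=False)

    # 2. Re-render docker-compose.yml with the new MODE (same step as init, omitted here)

    # 3. Restart only the app service; nginx keeps serving throughout
    subprocess.run(
        ["docker", "compose", "up", "-d", "--force-recreate", "app"],
        check=True,
    )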
teardown brings everything down cleanly. With --clean, it also deletes the generated config files.
The Challenges — And There Were Many
I want to be honest here. This project did not go smoothly. Here is every wall I hit, in order.
The folder was named wrong
My templates folder was named template — without the s. The CLI was looking for templates/nginx.conf.j2 and kept throwing a TemplateNotFound error. I spent more time than I'd like to admit staring at that error before noticing the missing letter.
The file was named wrong too
Once the folder name was fixed, the nginx config template turned out to be named nginx.config.j2 instead of nginx.conf.j2. Config versus conf: a two-character difference that broke the whole thing all over again.
Windows doesn't have chmod
Running chmod +x swiftdeploy in PowerShell throws an error. On Windows, you just run python swiftdeploy <command> directly — no permissions needed. This caught me off guard because the instructions assumed a Linux environment.
The Dockerfile wasn't saving
This one was the most frustrating. I edited the Dockerfile in VSCode multiple times, but the changes weren't persisting: the tab showed the unsaved-changes indicator, and I kept missing it. Every docker build was using the old version of the file, whose COPY paths were wrong, so pip install was installing nothing.
The fix was bypassing VSCode entirely and writing the file content directly from the PowerShell terminal using Out-File. Once the file was written programmatically, the builds started working correctly.
app/requirements.txt was empty
The requirements.txt inside the app/ folder had been created but had no content — completely empty. Because pip install -r on an empty file succeeds without error, the container built cleanly but had no packages installed. fastapi and uvicorn were both missing, and the container crashed on startup with No module named uvicorn.
I only caught this by running docker run --rm swift-deploy-1-node:latest pip list and seeing nothing but pip in the output. The fix was writing the dependencies directly from the terminal:
"fastapi==0.111.0`nuvicorn[standard]==0.29.0" | Out-File -FilePath app\requirements.txt -Encoding utf8
Docker kept caching broken layers
Even after fixing the Dockerfile and the requirements file, Docker kept serving the old cached image. The fix was force-removing the image entirely and rebuilding with --no-cache:
docker rmi -f swift-deploy-1-node:latest
docker build --no-cache -t swift-deploy-1-node:latest .
Port 3000 was already allocated
My Stage 2 project containers were still running in the background and had port 3000 allocated. Every attempt to test the app container on port 3000 failed with Bind for 0.0.0.0:3000 failed: port is already allocated. The fix was simply stopping the Stage 2 frontend container temporarily.
Nginx upstream validation broke pre-deployment
The validate command tests nginx syntax by spinning up a temporary nginx container. But before the stack is running, the app hostname doesn't exist on any Docker network, so nginx reports host not found in upstream. This looks like a failure but is completely expected — the config is syntactically correct, the hostname just doesn't resolve yet.
The fix was updating the validate logic to treat this specific error as a pass, not a failure. Any other nginx error would still cause validation to fail.
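In code, that special-casing is just inspecting nginx's stderr for the known message. Roughly, with the container invocation details as assumptions:

# Illustrative sketch of the tolerant nginx syntax check
import os
import subprocess

def nginx_config_ok(conf_path="nginx.conf"):
    result = subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{os.path.abspath(conf_path)}:/etc/nginx/conf.d/default.conf:ro",
         "nginx:alpine", "nginx", "-t"],
        capture_output=True, text=True,
    )
    if result.returncode == 0:
        return True
    # Before the stack exists, the upstream hostname can't resolve;
    # that specific error means the syntax itself is fine
    return "host not found in upstream" in result.stderr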
The Moment It Worked
After all of that, here is what the final deploy output looked like:
▶ swiftdeploy deploy
✔ nginx.conf generated
✔ docker-compose.yml generated
✔ manifest.yaml exists and is valid YAML
✔ All required manifest fields present and non-empty
✔ Docker image exists locally: swift-deploy-1-node:latest
✔ Nginx port 8080 is free
✔ nginx.conf is syntactically valid
✔ All checks passed — stack is ready to deploy
➜ Bringing up the stack…
✔ Container swiftdeploy-app-1 Healthy
✔ Container swiftdeploy-nginx-1 Started
✔ Stack is healthy → http://localhost:8080
And hitting http://localhost:8080 in the browser returned:
{
"message": "Welcome to SwiftDeploy API — running in stable mode",
"mode": "stable",
"version": "1.0.0",
"timestamp": "2026-05-02T17:44:59.804140+00:00"
}
Then promoting to canary:
▶ swiftdeploy promote canary
✔ manifest.yaml updated → mode: canary
✔ docker-compose.yml regenerated
➜ Restarting app container…
✔ Service healthy after promote → http://localhost:8080/healthz
➜ Active mode confirmed: canary
That moment — seeing Active mode confirmed: canary in the terminal — felt genuinely satisfying after everything it took to get there.
What I Learned
Declarative infrastructure is powerful but unforgiving. When the manifest is the single source of truth, every typo and every wrong path has consequences. But when it works, the elegance is undeniable — one file describes everything.
Always verify your file saves. On Windows especially, VSCode unsaved changes are easy to miss. When something isn't working despite your edits, verify the file content from the terminal before assuming the code is wrong.
Docker caching is a double-edged sword. It speeds up builds dramatically, but when you're debugging image content, it can hide your fixes behind stale layers. --no-cache should be your first instinct when something is inexplicably wrong.
Empty files fail silently. An empty requirements.txt is not an error — it's a valid file with no dependencies. Always verify what's actually inside your files, not just that they exist.
Pre-flight validation saves deployments. The five checks in the validate command caught real problems before they reached production. The nginx upstream check in particular required nuanced handling — not every nginx error is a real error.
Final Thoughts
SwiftDeploy is not a perfect tool. But it works. It deploys a full stack from a single manifest, handles canary deployments with a single command, and validates itself before touching anything.
More importantly, every challenge I hit while building it taught me something real about how infrastructure tools work — and why the details matter so much.
If you're working through HNG14 or any similar programme, my advice is simple: document your failures as carefully as your successes. The graders can tell the difference between someone who got lucky and someone who actually understands what they built.
Good luck out there. 🚀
Here's the repo in case you'd like to clone and replicate it:
GitHub: https://github.com/ntonous/hng14-stage4-taask.git
Built with Python, FastAPI, Nginx, Docker, Jinja2, and a lot of patience.
HNG14 Stage 4A — DevOps Track