The Honest Truth
I break things. A lot.
I've deployed code when my server disk was 99% full. I've promoted broken canaries without checking if they were actually working. I've made the same mistakes over and over.
So I built a tool that literally won't let me be stupid.
It's called SwiftDeploy. And this is the story of how I built it.
What Does This Tool Actually Do?
In simple terms:
- You write ONE file describing your app
- The tool generates everything else (the Nginx config and the Docker Compose file)
- Before deploying, it asks permission from a policy engine
- If your disk is too full or CPU is too high → deployment blocked
- If your canary has too many errors → promotion blocked
- You get a live dashboard showing what's happening
- You get an audit report showing what happened
Think of it like a security guard at the door who checks your ID before letting you in.
The Architecture (Simple Picture)
Think of it like this:
     You edit manifest.yaml
               ↓
    swiftdeploy CLI reads it
               ↓
     ┌─────────┼─────────┐
     ↓         ↓         ↓
 nginx.conf  Docker     OPA
             compose  (policy engine)
The CLI asks OPA before doing anything important. OPA says YES or NO with a reason. That's it.
The One File You Actually Edit
All I ever touch is manifest.yaml. Everything else is automatic.
app:
  name: swift-deploy-1
  mode: stable
services:
  image: nneoma-swiftdeploy:latest
  port: 3000
nginx:
  port: 8090
That's it. Three short sections. The tool handles the rest.
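To make "the tool handles the rest" a bit more concrete, here's a rough sketch of turning the manifest into an Nginx config. It makes several assumptions: the manifest is written as a Python dict with the nesting guessed from the file above, the template text is made up for illustration, and it uses the standard library's string.Template instead of jinja2 and pyyaml so it runs without extra installs.

```python
from string import Template

# The manifest as a dict; a real tool would load this from manifest.yaml.
# The nesting (app / services / nginx) is an assumption.
manifest = {
    "app": {"name": "swift-deploy-1", "mode": "stable"},
    "services": {"image": "nneoma-swiftdeploy:latest", "port": 3000},
    "nginx": {"port": 8090},
}

# Hypothetical template; the real nginx.conf template is not shown in this post.
NGINX_TEMPLATE = Template("""\
server {
    listen $nginx_port;
    location / {
        proxy_pass http://$app_name:$app_port;
    }
}
""")


def render_nginx(manifest: dict) -> str:
    """Fill the template with values taken from the manifest."""
    return NGINX_TEMPLATE.substitute(
        nginx_port=manifest["nginx"]["port"],
        app_name=manifest["app"]["name"],
        app_port=manifest["services"]["port"],
    )


print(render_nginx(manifest))
```

Because every generated file reads from the same dict, changing a port in one place changes it everywhere.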
The Eyes: Watching Everything
My API now has a /metrics endpoint. It's like a health tracker for your app.
It tells me:
- How many requests my app is handling
- How many errors are happening
- How slow the responses are
- How long the app has been running
Here's what it actually looks like:
http_requests_total{method="GET",status_code="200"} 42
app_uptime_seconds 67108
app_mode 1
chaos_active 0
Boring? Yes. Useful? Absolutely.
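Those lines are easy to turn into numbers a policy can check. Here's a minimal sketch that parses that text and computes an error rate. Treating any 5xx status_code as an "error" is an assumption, and the parser only handles simple lines like the sample above, not the full metrics format.

```python
import re

SAMPLE = '''\
http_requests_total{method="GET",status_code="200"} 42
app_uptime_seconds 67108
app_mode 1
chaos_active 0
'''

# metric name, optional {labels}, then the value
LINE_RE = re.compile(r'^(\w+)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        match = LINE_RE.match(line)
        if match:
            name, labels, value = match.groups()
            samples.append((name, dict(LABEL_RE.findall(labels or "")), float(value)))
    return samples


def error_rate(samples):
    """Share of requests with a 5xx status_code (assumed definition of an error)."""
    total = errors = 0.0
    for name, labels, value in samples:
        if name == "http_requests_total":
            total += value
            if labels.get("status_code", "").startswith("5"):
                errors += value
    return errors / total if total else 0.0


print(error_rate(parse_metrics(SAMPLE)))
```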
The Brain: Asking Permission Before Doing Anything Dumb
Here's where it gets clever.
I added something called Open Policy Agent (OPA). It's just a tiny program that answers one question: "Is it safe to do this?"
Before Deploying
I ask OPA: "Hey, is my server healthy enough for a deployment?"
I send my disk space, CPU load, and memory.
OPA checks the rules and says YES or NO. If NO, it tells me WHY.
Before Promoting a Canary
I ask a different question: "Is my canary version actually working?"
I send the current error rate and how slow the responses are.
OPA blocks me if the error rate is over 1% or P99 latency is over 500ms.
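Both questions go to OPA the same way: a POST to its Data API at /v1/data/<policy path> with the facts wrapped in an "input" object. Here's a sketch of what that call might look like. The OPA address, the policy paths, the input field names, and the sample values are all assumptions for illustration, not taken from the SwiftDeploy repo.

```python
import json
import urllib.request

# Assumption: OPA listens on its default port and is only reachable locally.
OPA_URL = "http://localhost:8181"


def build_opa_request(policy_path, input_doc):
    """Build a POST to OPA's Data API with the facts wrapped in {"input": ...}."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        f"{OPA_URL}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_opa(policy_path, input_doc, timeout=5):
    """Send the request and return the "result" part of OPA's answer."""
    request = build_opa_request(policy_path, input_doc)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response).get("result")


# Hypothetical policy paths, field names, and values.
infra_req = build_opa_request(
    "swiftdeploy/infrastructure",
    {"disk_free_gb": 9.45, "cpu_load": 0.27, "mem_free_percent": 40.0},
)
canary_req = build_opa_request(
    "swiftdeploy/canary",
    {"error_rate": 0.0, "p99_latency_ms": 120},
)
print(infra_req.full_url)
print(canary_req.full_url)
```

Only build_opa_request runs here, so the sketch works without a live OPA. ask_opa is what a CLI would call once OPA is up.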
The Rules (Written Like Plain English)
The rules are easy to read. Here's the infrastructure rule:
Allow deployment if:
- Free disk space is at least 10GB
- CPU load is under 2.0
- Memory is at least 10% free
If disk is too full, say: "Disk free below minimum"
If CPU is too high, say: "CPU load exceeds maximum"
Here's the canary rule:
Allow promotion if:
- Error rate is under 1%
- P99 latency is under 500ms
If errors are too high, say: "Error rate exceeds 1%"
If latency is too high, say: "P99 latency too high"
The thresholds aren't buried in code. They live in a separate file. I can change them without touching the rules.
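The actual policy files aren't shown in this post, so here's the same logic restated as a small Python sketch. It is not the OPA policy itself. The threshold key names, input field names, and the memory message are made up for illustration; the other messages come from the rules above.

```python
# Thresholds sit in their own structure, mirroring the separate thresholds file.
THRESHOLDS = {
    "min_disk_free_gb": 10.0,
    "max_cpu_load": 2.0,
    "min_mem_free_percent": 10.0,
    "max_error_rate": 0.01,  # 1%
    "max_p99_latency_ms": 500,
}


def check_infrastructure(host, t=THRESHOLDS):
    """Allow a deploy only when disk, CPU, and memory are all within limits."""
    reasons = []
    if host["disk_free_gb"] < t["min_disk_free_gb"]:
        reasons.append("Disk free below minimum")
    if host["cpu_load"] >= t["max_cpu_load"]:
        reasons.append("CPU load exceeds maximum")
    if host["mem_free_percent"] < t["min_mem_free_percent"]:
        reasons.append("Memory free below minimum")  # wording is an assumption
    return {"allow": not reasons, "reasons": reasons}


def check_canary(stats, t=THRESHOLDS):
    """Allow promotion only when error rate and P99 latency are under the limits."""
    reasons = []
    if stats["error_rate"] >= t["max_error_rate"]:
        reasons.append("Error rate exceeds 1%")
    if stats["p99_latency_ms"] >= t["max_p99_latency_ms"]:
        reasons.append("P99 latency too high")
    return {"allow": not reasons, "reasons": reasons}


print(check_infrastructure({"disk_free_gb": 9.45, "cpu_load": 0.27, "mem_free_percent": 40.0}))
print(check_canary({"error_rate": 0.0, "p99_latency_ms": 120}))
```

Changing a limit means editing THRESHOLDS only; the rule functions stay untouched.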
The Commands You Actually Type
# Generate all the config files
./swiftdeploy init
# Check if everything is ready
./swiftdeploy validate
# Deploy the whole thing
./swiftdeploy deploy
# Switch to canary mode (gets checked first)
./swiftdeploy promote canary
# Switch back to stable
./swiftdeploy promote stable
# See what's happening right now
./swiftdeploy status
# Get a report of everything that happened
./swiftdeploy audit
# Turn everything off
./swiftdeploy teardown
The Dashboard (What You See When You Run status)
==================================================
SwiftDeploy Status Dashboard
==================================================
[Requests] Total: 22 | Errors: 0 | Error Rate: 0.00%
[Host] Disk: 9.45GB | CPU: 0.27 | Mem: 76.46%
[Infrastructure Policy] ✗ FAIL
- Disk free (9.5GB) is below minimum (10.0GB)
[Canary Safety Policy] ✓ PASS
It updates live. I can see exactly which rule is failing and why.
The Hard Gate (When I Tried to Break It On Purpose)
I filled up my disk until only 9.45GB was free. Then I tried to deploy:
$ ./swiftdeploy deploy
[swiftdeploy] Checking pre-deploy policy...
Disk: 9.45GB free, CPU: 0.27, Mem: 76.46%
[BLOCK] Infrastructure policy failed:
- Disk free (9.5GB) is below minimum (10.0GB)
[swiftdeploy] Deploy blocked by policy.
The deployment was blocked. No damage. No panic. Just a clear message telling me exactly what was wrong.
This is the whole point. The tool won't let me break things.
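A hard gate like this boils down to one rule: nothing starts unless the policy said yes. Here's a minimal sketch of that shape. The decision format (an "allow" flag plus a list of "reasons"), the function names, and the "passed" message are assumptions, not the tool's real internals.

```python
def gate(decision, label="Infrastructure"):
    """Print the policy outcome and return an exit code (0 means go ahead)."""
    if decision.get("allow"):
        print(f"[swiftdeploy] {label} policy passed.")
        return 0
    print(f"[BLOCK] {label} policy failed:")
    for reason in decision.get("reasons", []):
        print(f"  - {reason}")
    print("[swiftdeploy] Deploy blocked by policy.")
    return 1


def deploy(decision, start_stack):
    """Call start_stack() only when the gate returns 0."""
    code = gate(decision)
    if code == 0:
        start_stack()
    return code


blocked = {"allow": False, "reasons": ["Disk free (9.5GB) is below minimum (10.0GB)"]}
exit_code = deploy(blocked, start_stack=lambda: print("starting containers..."))
print("exit code:", exit_code)
```

Returning a non-zero exit code matters as much as the message: it lets scripts and CI jobs stop too.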
The Isolation Test (Making Sure OPA Is Hidden)
OPA needs to be reachable by my CLI but NOT by the public. I tested it:
$ curl http://34.46.53.225:8090/v1/data
404 Not Found
Public users can't see OPA. No one can query my policies or see my thresholds. That's how it should be.
The Audit Report (For When Your Boss Asks "What Happened?")
Running ./swiftdeploy audit gives me a clean markdown file:
# SwiftDeploy Audit Report
Generated: 2026-05-06 18:43:08 UTC
## Timeline
- 2026-05-06T18:27:09Z: deploy (success)
- 2026-05-06T18:27:22Z: promote (success)
## Policy Violations
- `2026-05-06T18:43:08Z` Infrastructure policy failed
Now when someone asks "What broke at 3am?" I have an answer.
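A report like that can come from a plain list of events. Here's a sketch, assuming each event is a dict with "time", "action", and "outcome" keys, and that blocked attempts carry a "violation" message. That event format is hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical event log; the tool's real storage format is not shown in this post.
EVENTS = [
    {"time": "2026-05-06T18:27:09Z", "action": "deploy", "outcome": "success"},
    {"time": "2026-05-06T18:27:22Z", "action": "promote", "outcome": "success"},
    {"time": "2026-05-06T18:43:08Z", "action": "deploy", "outcome": "blocked",
     "violation": "Infrastructure policy failed"},
]


def render_audit(events, now=None):
    """Render the events as a markdown report with a timeline and a violations list."""
    now = now or datetime.now(timezone.utc)
    lines = [
        "# SwiftDeploy Audit Report",
        f"Generated: {now:%Y-%m-%d %H:%M:%S} UTC",
        "## Timeline",
    ]
    for event in events:
        if "violation" not in event:
            lines.append(f"- {event['time']}: {event['action']} ({event['outcome']})")
    lines.append("## Policy Violations")
    for event in events:
        if "violation" in event:
            lines.append(f"- `{event['time']}` {event['violation']}")
    return "\n".join(lines)


print(render_audit(EVENTS))
```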
The Chaos Test (Injecting Errors On Purpose)
I added a chaos endpoint for testing. In canary mode, I can make things fail on purpose:
# Make roughly 30% of requests fail
curl -X POST http://localhost:8090/chaos \
-d '{"mode": "error", "rate": 0.3}'
# Make requests slow (2 second delay)
curl -X POST http://localhost:8090/chaos \
-d '{"mode": "slow", "duration": 2}'
# Turn chaos off
curl -X POST http://localhost:8090/chaos \
-d '{"mode": "recover"}'
When I injected errors, the dashboard immediately showed the canary policy failing. Promotion was blocked. Everything worked as expected.
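On the app side, a chaos switch like this can be a tiny bit of state that each request consults. Here's a sketch of one way to do it. It assumes "rate" is a per-request failure probability and "duration" is a delay in seconds; the class and method names are hypothetical.

```python
import random


class ChaosState:
    """Holds the current chaos settings (field names follow the curl payloads)."""

    def __init__(self):
        self.mode = "recover"
        self.rate = 0.0
        self.duration = 0.0

    def update(self, payload):
        """Apply a payload such as {"mode": "error", "rate": 0.3}."""
        self.mode = payload.get("mode", "recover")
        self.rate = float(payload.get("rate", 0.0))
        self.duration = float(payload.get("duration", 0.0))

    def should_fail(self, rng=random):
        """In error mode, fail each request with probability `rate`."""
        return self.mode == "error" and rng.random() < self.rate

    def delay_seconds(self):
        """In slow mode, return how long to sleep before responding."""
        return self.duration if self.mode == "slow" else 0.0


state = ChaosState()
state.update({"mode": "error", "rate": 0.3})
rng = random.Random(0)  # seeded so repeated runs behave the same
failures = sum(state.should_fail(rng) for _ in range(10_000))
print(f"{failures} of 10000 simulated requests failed")
```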
What I Learned
One source of truth saves your sanity.
Editing one file is way better than managing five different config files. Nothing gets out of sync.
Keep policy separate from code.
I can change deployment rules without touching the app. A security team can update thresholds, and different environments can have different rules.
Metrics make invisible problems visible.
Without metrics, I was guessing. With metrics, I know exactly what's happening.
Fail fast. Fail loudly.
Blocking a broken deployment with a clear error message is much better than deploying and finding out later.
Audit trails aren't just for compliance.
They're for debugging. When something breaks, I have a complete timeline.
How You Can Try This Yourself
# Clone the repo
git clone https://github.com/Ada-Mazi/swiftdeploy
cd swiftdeploy
# Build the app
docker build -t nneoma-swiftdeploy:latest app/
# Deploy everything
./swiftdeploy deploy
# Check if it's working
curl http://localhost:8090/healthz
# See the dashboard
./swiftdeploy status
# View the metrics
curl http://localhost:8090/metrics
Live Demo
- Dashboard: Check the status
- Metrics: View the raw metrics
SwiftDeploy
A declarative CLI tool that generates Nginx and Docker Compose configs from a single manifest.yaml and manages the full container lifecycle.
Prerequisites
- Docker installed
- Python 3.10+
- jinja2 and pyyaml installed
Install dependencies:
pip3 install jinja2 pyyaml
Quick Start
git clone https://github.com/Ada-Mazi/swiftdeploy
cd swiftdeploy
pip3 install jinja2 pyyaml
docker build -t swift-deploy-1-node:latest app/
./swiftdeploy deploy
Subcommands
init
Parses manifest.yaml and generates nginx.conf and docker-compose.yml
./swiftdeploy init
validate
Runs 5 pre-flight checks
./swiftdeploy validate
Checks:
- manifest.yaml exists and is valid YAML
- All required fields present and non-empty
- Docker image exists locally
- Nginx port is not already bound
- Generated nginx.conf is syntactically valid
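As an illustration of the "required fields present and non-empty" check, here's a minimal sketch. The list of required fields is an assumption based on the sample manifest, and the manifest is passed in as an already-parsed dict.

```python
# Assumed required fields, written as (section, key) pairs.
REQUIRED = [
    ("app", "name"),
    ("app", "mode"),
    ("services", "image"),
    ("services", "port"),
    ("nginx", "port"),
]


def missing_fields(manifest):
    """Return dotted names of required fields that are absent or empty."""
    missing = []
    for section, key in REQUIRED:
        value = (manifest.get(section) or {}).get(key)
        if value is None or value == "":
            missing.append(f"{section}.{key}")
    return missing


good = {
    "app": {"name": "swift-deploy-1", "mode": "stable"},
    "services": {"image": "nneoma-swiftdeploy:latest", "port": 3000},
    "nginx": {"port": 8090},
}
bad = {"app": {"name": "", "mode": "stable"}, "services": {"port": 3000}}

print(missing_fields(good))
print(missing_fields(bad))
```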
deploy
Builds image, starts stack, waits for health checks
./swiftdeploy deploy
promote
Switches mode with rolling restart
./swiftdeploy promote canary
./swiftdeploy promote stable
teardown
Removes all containers, networks, volumes
./swiftdeploy teardown
./swiftdeploy teardown --clean
API Endpoints
- GET / welcome message with mode, version, timestamp
- GET /healthz liveness check with…
Final Thoughts
Building this was hard. But now I have a tool that:
- Generates everything from one file
- Watches my metrics
- Blocks bad deployments
- Shows me a live dashboard
- Gives me an audit trail
And most importantly, it stops me from breaking things at 2am.
That's a win.
What's your 2am deployment horror story?