Building a Self-Deploying Infrastructure Tool with OPA Policy Guards

#devops #ai #softwareengineering

What I Built and Why

Author CHIMA_THE_NIGERIAN_SUPERMAN

For HNG Stage 4, I built SwiftDeploy — a CLI tool that turns a single YAML manifest into a fully running web application with Nginx, Docker containers, and Open Policy Agent security gates.

The problem at hand: Traditional DevOps requires writing multiple config files by hand, manually checking if the environment is safe, and hoping nothing breaks during deployment.

The solution I worked on: One manifest file describes everything. The tool generates all configs, checks policies automatically, and refuses to deploy if conditions aren't met.

How the Manifest Works
The manifest.yaml is the only file I edit. It declares:

yaml
services:
  image: swift-deploy-1-node:latest
  port: 3000
  mode: stable

nginx:
  image: nginx:latest
  port: 8080

From this single file, swiftdeploy init generates:

nginx.conf with reverse proxy, JSON logging, and error pages

docker-compose.yml with health checks, networks, and volumes

If you delete the generated files, they regenerate exactly the same way. The manifest is the single source of truth. I call it the omni truth, lol.
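
To make the "regenerates exactly the same way" idea concrete, here is a minimal sketch of rendering a config from manifest values. It is not the actual SwiftDeploy code: it uses the standard library's string.Template so it runs without extra packages, the manifest is shown as an already-parsed dict, and the template, the `app` upstream hostname, and the function name are all made up for illustration.

```python
from string import Template

# Hypothetical parsed form of manifest.yaml (roughly what a YAML loader would return).
MANIFEST = {
    "services": {"image": "swift-deploy-1-node:latest", "port": 3000, "mode": "stable"},
    "nginx": {"image": "nginx:latest", "port": 8080},
}

# Simplified, illustrative template; "app" is an assumed upstream hostname.
NGINX_TEMPLATE = Template(
    "server {\n"
    "    listen $nginx_port;\n"
    "    location / {\n"
    "        proxy_pass http://app:$app_port;\n"
    "    }\n"
    "}\n"
)


def render_nginx_conf(manifest):
    """Render nginx.conf text using only values from the manifest."""
    return NGINX_TEMPLATE.substitute(
        nginx_port=manifest["nginx"]["port"],
        app_port=manifest["services"]["port"],
    )


if __name__ == "__main__":
    print(render_nginx_conf(MANIFEST))
```

Because the output depends only on the manifest and a fixed template, rendering twice gives byte-identical text, which is what makes deleting and regenerating the files safe.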

The Policy Brain: Open Policy Agent

The coolest part is the OPA sidecar. Instead of hardcoding "if disk < 10GB, don't deploy" in the CLI, I wrote it as a Rego policy:

rego
allow if {
    disk_free_gb > 10
    cpu_load < 2.0
}

The CLI asks OPA: "Should I deploy?" OPA answers with reasoning — not just yes/no, but exactly why. If OPA is unreachable, the CLI fails safely instead of crashing.
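
Here is a rough sketch of what that question-and-answer could look like from the CLI side. It assumes OPA is listening on its default port 8181 and is queried through its Data API (`POST /v1/data/<path>` with an `input` document). The policy path `swiftdeploy/deploy`, the `reasons` rule name, and the function names are assumptions for illustration, not taken from the project.

```python
import json
import urllib.request

# Assumed values: 8181 is OPA's default port; the policy path is hypothetical.
OPA_URL = "http://localhost:8181/v1/data/swiftdeploy/deploy"


def query_opa(input_doc, url=OPA_URL, timeout=2.0):
    """POST the input document to OPA's Data API and return the 'result' value."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response).get("result")


def decide(input_doc, query=query_opa):
    """Return (allowed, reasons). Any failure to get a clear answer means deny."""
    try:
        result = query(input_doc)
    except (OSError, ValueError) as exc:
        # URLError, connection errors and timeouts are OSError subclasses;
        # malformed JSON raises a ValueError subclass.
        return False, [f"policy engine unavailable: {exc}"]
    if not isinstance(result, dict):
        return False, ["policy returned no decision"]
    allowed = result.get("allow") is True
    # 'reasons' is an assumed rule name for the explanation OPA sends back.
    reasons = [str(r) for r in result.get("reasons", [])]
    return allowed, reasons
```

The important design choice is that every error path returns a denial with a reason, so an unreachable OPA blocks the deploy instead of crashing the CLI or silently letting it through.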

The Observability Eyes: Prometheus Metrics

Every request is tracked with counters by method, path, and status code. Latency is recorded in histogram buckets.
The /metrics endpoint serves everything in Prometheus format.
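
For anyone who has not seen the Prometheus text format, this is a small dependency-free sketch of the idea: a counter keyed by method, path, and status, plus a latency histogram with cumulative buckets. The metric names, the bucket bounds, and the `Metrics` class are illustrative assumptions, not the names the app actually exposes.

```python
from collections import defaultdict

BUCKETS = (0.1, 0.5, 1.0)  # illustrative upper bounds, in seconds


class Metrics:
    def __init__(self):
        self.requests = defaultdict(int)          # (method, path, status) -> count
        self.bucket_counts = [0] * len(BUCKETS)   # cumulative counts per bucket
        self.latency_sum = 0.0
        self.latency_count = 0

    def observe(self, method, path, status, seconds):
        """Record one finished request."""
        self.requests[(method, path, str(status))] += 1
        self.latency_sum += seconds
        self.latency_count += 1
        # Buckets are cumulative: count the request in every bucket it fits under.
        for i, bound in enumerate(BUCKETS):
            if seconds <= bound:
                self.bucket_counts[i] += 1

    def render(self):
        """Return the metrics in Prometheus text exposition format."""
        lines = ["# TYPE http_requests_total counter"]
        for (method, path, status), n in sorted(self.requests.items()):
            lines.append(
                f'http_requests_total{{method="{method}",path="{path}",status="{status}"}} {n}'
            )
        lines.append("# TYPE http_request_duration_seconds histogram")
        for bound, n in zip(BUCKETS, self.bucket_counts):
            lines.append(f'http_request_duration_seconds_bucket{{le="{bound}"}} {n}')
        # The +Inf bucket always equals the total number of observations.
        lines.append(f'http_request_duration_seconds_bucket{{le="+Inf"}} {self.latency_count}')
        lines.append(f"http_request_duration_seconds_sum {self.latency_sum}")
        lines.append(f"http_request_duration_seconds_count {self.latency_count}")
        return "\n".join(lines) + "\n"
```

A request handler would call `observe(...)` once per response, and the /metrics route would return `render()` as plain text.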

The live dashboard (swiftdeploy status) scrapes these metrics every 3 seconds and shows real-time policy compliance.

Chaos Testing: Breaking Things on Purpose
The canary mode has a /chaos endpoint that lets you inject failures:

slow mode: Responses take N seconds

error mode: 50% of requests return 500 errors

recover: Cancels all chaos
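
The three modes above can be sketched as a tiny state object that decides, per request, how long to wait and which status to return. This is only an illustration of the behaviour described here; the class and method names are hypothetical, and the real endpoint may be wired up differently.

```python
import random


class ChaosState:
    """Holds the current chaos mode and plans the outcome of each request."""

    def __init__(self, rng=None):
        self.mode = None            # None, "slow", or "error"
        self.delay_seconds = 0.0
        # Injectable random source so the behaviour can be tested deterministically.
        self.rng = rng if rng is not None else random.Random()

    def set_slow(self, seconds):
        self.mode, self.delay_seconds = "slow", float(seconds)

    def set_error(self):
        self.mode, self.delay_seconds = "error", 0.0

    def recover(self):
        self.mode, self.delay_seconds = None, 0.0

    def plan(self):
        """Return (delay_seconds, status_code) for the next request."""
        if self.mode == "slow":
            return self.delay_seconds, 200
        if self.mode == "error" and self.rng.random() < 0.5:
            return 0.0, 500   # roughly half of requests fail
        return 0.0, 200
```

A handler would sleep for the returned delay and then respond with the returned status, so recovering is just a matter of resetting the state.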

When I activated error mode, the pre-promote gate blocked promotion because the error rate exceeded the 1% threshold. The audit report recorded every violation.
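
To show the arithmetic behind that gate, here is a minimal sketch that reads request counters in Prometheus text format and compares the error rate to the 1% threshold. In the project the decision belongs to the policy layer; this plain-Python version is only an illustration, and the metric name `http_requests_total`, the rule that only 5xx responses count as errors, and the function names are assumptions.

```python
import re

# Matches counter samples such as:
#   http_requests_total{method="GET",path="/",status="200"} 42
SAMPLE = re.compile(r'^http_requests_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
STATUS = re.compile(r'status="(\d{3})"')


def error_rate(metrics_text):
    """Return the fraction of requests whose status is 5xx (assumed definition of an error)."""
    total = errors = 0.0
    for line in metrics_text.splitlines():
        sample = SAMPLE.match(line.strip())
        if not sample:
            continue  # skip comments and other metrics
        status = STATUS.search(sample.group("labels"))
        if not status:
            continue
        value = float(sample.group("value"))
        total += value
        if status.group(1).startswith("5"):
            errors += value
    return errors / total if total else 0.0


def can_promote(metrics_text, threshold=0.01):
    """Return (allowed, message) for a pre-promote check against the threshold."""
    rate = error_rate(metrics_text)
    if rate > threshold:
        return False, f"error rate {rate:.1%} exceeds {threshold:.0%} threshold"
    return True, f"error rate {rate:.1%} within {threshold:.0%} threshold"
```

With error mode on, about half of the requests return 500, so the rate sits near 50% and the check fails by a wide margin.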

What I Learned

Declarative configuration is powerful — One file generates an entire stack

Policy-as-code prevents mistakes — OPA catches problems before they reach users

Observability matters — Without metrics, you're deploying blind

Always whitelist your own IP — I learned this the hard way in Stage 3!

Try It Yourself
The project is open source at github.com/icode-py/swiftdeploy.

git clone https://github.com/icode-py/swiftdeploy.git
cd swiftdeploy
pip install pyyaml jinja2 psutil
cd app && docker build -t swift-deploy-1-node:latest . && cd ..
python swiftdeploy deploy
curl http://localhost:8080/

Thank you
