Here's why I built it, how it works, and what I learned.
The Problem: AI Agents Are Powerful but Terrifying
I've been obsessed with AI agents - not chatbots, but agents that actually do things. Agents that can:
- Merge pull requests
- Deploy to Kubernetes
- Update database records
- Send Slack messages on your behalf
The technology is ready. But every time I tried to deploy one to production, the same thing happened:
Security said no.
And honestly? They were right.
Think about it: you're giving an AI the ability to write to production systems, and there's no audit trail, no approval workflow, no way to enforce policies. It's like giving an intern root access and hoping for the best.
I kept seeing teams stuck in what I call "PoC Purgatory" - amazing demos that never ship because there's no governance story.
The Solution: Policy-Before-Dispatch
What if every AI action had to pass through a policy check before it executed?
That's the core idea behind Cordum.
┌─────────────┐     ┌───────────────┐     ┌─────────────┐
│  AI Agent   │ --> │ Safety Kernel │ --> │   Action    │
└─────────────┘     └───────┬───────┘     └─────────────┘
                            │
                     ┌──────┴──────┐
                     │   Policy    │
                     │  (as code)  │
                     └─────────────┘
Before ANY job executes, the Safety Kernel evaluates your policy and returns one of:
- ✅ Allow - proceed normally
- ❌ Deny - block with reason
- 👤 Require Approval - human in the loop
- ⏳ Throttle - rate limit
Show Me the Code
Here's what a policy looks like:
# policy.yaml
rules:
  - id: require-approval-for-prod
    match:
      risk_tags: [prod, write]
    decision: require_approval
    reason: "Production writes need human approval"
  - id: block-destructive
    match:
      capabilities: [delete, drop, destroy]
    decision: deny
    reason: "Destructive operations not allowed"
  - id: allow-read-only
    match:
      risk_tags: [read]
    decision: allow
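For intuition, here is a rough sketch of how rules like these could be evaluated. The matching semantics are assumptions, not documented Cordum behavior: rules are checked top to bottom and the first match wins, a rule needs all of its `risk_tags` and at least one of its `capabilities` present on the job, and a job that matches nothing is denied. All type and function names are hypothetical.

```go
package main

import "fmt"

// Match, Rule, and Job are hypothetical types that mirror the YAML fields.
type Match struct {
	RiskTags     []string
	Capabilities []string
}

type Rule struct {
	ID       string
	Match    Match
	Decision string
	Reason   string
}

type Job struct {
	RiskTags     []string
	Capabilities []string
}

// examplePolicy is the policy above, written out as Go values.
var examplePolicy = []Rule{
	{ID: "require-approval-for-prod", Match: Match{RiskTags: []string{"prod", "write"}},
		Decision: "require_approval", Reason: "Production writes need human approval"},
	{ID: "block-destructive", Match: Match{Capabilities: []string{"delete", "drop", "destroy"}},
		Decision: "deny", Reason: "Destructive operations not allowed"},
	{ID: "allow-read-only", Match: Match{RiskTags: []string{"read"}}, Decision: "allow"},
}

func contains(set []string, v string) bool {
	for _, s := range set {
		if s == v {
			return true
		}
	}
	return false
}

// matches assumes all-of semantics for risk tags and any-of for capabilities.
func (r Rule) matches(j Job) bool {
	for _, t := range r.Match.RiskTags {
		if !contains(j.RiskTags, t) {
			return false
		}
	}
	if len(r.Match.Capabilities) == 0 {
		return true
	}
	for _, c := range r.Match.Capabilities {
		if contains(j.Capabilities, c) {
			return true
		}
	}
	return false
}

// Evaluate returns the decision and ID of the first matching rule.
// With no match it denies by default (an assumed fail-closed choice).
func Evaluate(rules []Rule, j Job) (decision, ruleID string) {
	for _, r := range rules {
		if r.matches(j) {
			return r.Decision, r.ID
		}
	}
	return "deny", ""
}

func main() {
	job := Job{RiskTags: []string{"prod", "write"}}
	decision, ruleID := Evaluate(examplePolicy, job)
	fmt.Println(decision, ruleID) // require_approval require-approval-for-prod
}
```

Under first-match-wins, rule order matters: a broad `allow` placed above a narrow `deny` would shadow it.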
When an agent tries to do something dangerous, Cordum intervenes:
{
  "job_id": "job_abc123",
  "decision": "require_approval",
  "reason": "Production writes need human approval",
  "matched_rule": "require-approval-for-prod"
}
The job waits until a human approves it in the dashboard. Full audit trail. Compliance happy.
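A client that receives this decision could decode it with the standard `encoding/json` package. This is a minimal sketch: the struct and function names are hypothetical, and only the four JSON field names come from the example above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// DecisionResponse maps the JSON fields shown above. The Go type name is
// hypothetical; only the JSON keys come from the example.
type DecisionResponse struct {
	JobID       string `json:"job_id"`
	Decision    string `json:"decision"`
	Reason      string `json:"reason"`
	MatchedRule string `json:"matched_rule"`
}

// ParseDecision decodes a decision payload and rejects one with no decision,
// so a malformed response is never mistaken for an approval.
func ParseDecision(data []byte) (DecisionResponse, error) {
	var d DecisionResponse
	if err := json.Unmarshal(data, &d); err != nil {
		return DecisionResponse{}, fmt.Errorf("decode decision: %w", err)
	}
	if d.Decision == "" {
		return DecisionResponse{}, errors.New("decision field missing")
	}
	return d, nil
}

// NeedsHuman reports whether the job must wait for a human approval.
func (d DecisionResponse) NeedsHuman() bool {
	return d.Decision == "require_approval"
}

func main() {
	raw := []byte(`{
	  "job_id": "job_abc123",
	  "decision": "require_approval",
	  "reason": "Production writes need human approval",
	  "matched_rule": "require-approval-for-prod"
	}`)
	d, err := ParseDecision(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %s (rule %s), needs human: %t\n",
		d.JobID, d.Decision, d.MatchedRule, d.NeedsHuman())
}
```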
Architecture
Cordum is a control plane, not an agent framework. It orchestrates and governs agents - it doesn't replace LangChain or CrewAI.
┌─────────────────────────────────────────────────────────┐
│                  Cordum Control Plane                   │
├─────────────────────────────────────────────────────────┤
│  ┌───────────┐  ┌───────────────┐  ┌─────────────────┐  │
│  │ Scheduler │  │ Safety Kernel │  │ Workflow Engine │  │
│  └───────────┘  └───────────────┘  └─────────────────┘  │
├─────────────────────────────────────────────────────────┤
│  ┌───────────────┐  ┌───────────────────────────────┐   │
│  │   NATS Bus    │  │         Redis (State)         │   │
│  └───────────────┘  └───────────────────────────────┘   │
└─────────────────────────────────────────────────────────┘
          │                  │                  │
     ┌────┴────┐        ┌────┴────┐        ┌────┴────┐
     │ Worker  │        │ Worker  │        │ Worker  │
     │ (Slack) │        │ (GitHub)│        │  (K8s)  │
     └─────────┘        └─────────┘        └─────────┘
Tech stack:
- Go - Core control plane (~15K lines)
- NATS JetStream - Message bus with at-least-once delivery
- Redis - State store for jobs, workflows, context
- React - Dashboard with real-time updates
Performance:
- < 5ms policy evaluation latency
- 10k+ events/sec per node
- 100% deterministic replay
The Protocol: CAP
I also built a protocol called CAP (Cordum Agent Protocol). Think of it as MCP (Model Context Protocol) but for distributed orchestration.
MCP is great for connecting a single model to tools and context. But it doesn't cover:
- Scheduling across worker pools
- Policy enforcement
- State machine (pending → running → succeeded)
- Heartbeats and worker liveness
CAP fills those gaps. It's a separate repo with SDKs for Go, Python, Node, and C++.
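// Hello World worker in Go
worker := cordum.NewWorker(cordum.Config{
	Pool:     "my-workers",
	Subjects: []string{"job.hello.*"},
})

// Handle jobs published on job.hello.greet.
worker.Handle("job.hello.greet", func(ctx cordum.JobContext) error {
	// Checked type assertion: a missing or non-string "name" returns an
	// error instead of panicking the worker.
	name, ok := ctx.Input["name"].(string)
	if !ok {
		return fmt.Errorf("input %q must be a string", "name")
	}
	return ctx.Succeed(map[string]any{
		"message": fmt.Sprintf("Hello, %s!", name),
	})
})

worker.Run()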
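The `job.hello.*` subscription looks like a NATS-style subject pattern, where `*` matches exactly one dot-separated token. Assuming CAP follows that convention (an assumption, not something confirmed here), the matching rule can be sketched as below. `MatchSubject` is a hypothetical helper, and the NATS `>` wildcard is deliberately not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// MatchSubject reports whether subject matches pattern, where "*" in the
// pattern stands for exactly one dot-separated token. Hypothetical helper
// for illustration; it does not implement the ">" wildcard.
func MatchSubject(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	if len(p) != len(s) {
		return false
	}
	for i := range p {
		if s[i] == "" {
			return false // empty tokens are not valid subjects
		}
		if p[i] != "*" && p[i] != s[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(MatchSubject("job.hello.*", "job.hello.greet"))      // true
	fmt.Println(MatchSubject("job.hello.*", "job.hello.greet.loud")) // false
	fmt.Println(MatchSubject("job.hello.*", "job.bye.greet"))        // false
}
```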
Pre-Built Packs
Nobody wants to write integrations from scratch. Cordum comes with 16 pre-built packs, including:
| Category | Packs |
|---|---|
| Communication | Slack, MS Teams, Webhooks |
| DevOps | GitHub, GitLab, Jira |
| Infrastructure | Kubernetes, Terraform, Vault |
| Monitoring | Prometheus, Sentry, OpenTelemetry |
| AI/LLM | MCP Bridge, MCP Client |
Install with one command:
cordumctl pack install slack
Quick Start
Want to try it? Here's the 60-second version:
git clone https://github.com/cordum-io/cordum
cd cordum
docker compose up -d
Open http://localhost:8082 - that's your dashboard.
What I Learned Building This
1. Safety as a feature, not a constraint
I initially thought of governance as a "necessary evil" - something enterprises need for compliance. But I've come to see it as a feature.
When you can prove that every AI action was evaluated against policy and logged, you unlock use cases that were previously impossible. Banks can use AI agents. Healthcare can use AI agents. The "permission to write" becomes a competitive advantage.
2. The protocol matters more than I expected
I spent a lot of time on CAP, and it paid off. Having a clean protocol means:
- Workers can be written in any language
- The control plane can evolve independently
- Third parties can build compatible tools
3. Open source is a distribution strategy
I could have built this as a closed SaaS from day one. But open source:
- Builds trust (you can read the code)
- Enables self-hosting (enterprises love this)
- Creates a community funnel
The business model is open core: self-hosted is free forever, cloud/enterprise features are paid.
What's Next
The roadmap includes:
- Helm chart for Kubernetes deployment
- Cordum Cloud - managed version
- Visual workflow editor in the dashboard
- More packs - AWS, GCP, PagerDuty, etc.
Try It Out
- 🌐 Website: https://cordum.io
- 📦 GitHub: https://github.com/cordum-io/cordum
- 📋 Protocol: https://github.com/cordum-io/cap
- 📚 Docs: https://cordum.io/docs
If you're building AI agents and want governance built in, give it a try. Star the repo if you find it useful ⭐
I'd love feedback - what's missing? What would make this useful for your projects?
Thanks for reading! I'm happy to answer questions in the comments.