yaron torjeman
Every AI agent framework focuses on making agents smarter. None of them ask what happens when agents screw up.

A new agent framework drops every week. CrewAI. LangChain. AutoGen. OpenAI Agents SDK.

They all compete on the same axis: how smart can we make the agent?

Nobody's competing on: how do we stop it from deleting prod?

I work in DevOps. I manage Kubernetes clusters, CI/CD pipelines, and cloud infrastructure for a living. I've watched teams build incredible AI agent demos that can manage infrastructure, write code, push deployments.

Then the security review happens.

"So this thing can just… run kubectl delete namespace production?"

"Well, technically…"

"No."

Demo stays in staging. Forever. I've seen this three times this year alone.

The gap nobody's filling

Here's the thing. Airflow, Temporal, n8n — they're great at running stuff. But they don't care what they're running. Safety is your problem.

Agent frameworks? They care about reasoning, tool selection, memory. They don't care what happens when the reasoning is wrong.

There's a gap between "the agent decided to do X" and "X actually happened in production." Nobody owns that gap.

So I built something to own it.

Cordum

Cordum is an open-source control plane that sits between AI agents and infrastructure. Every intent gets intercepted and evaluated before execution.

```
Agent: "I want to run kubectl delete namespace production"
    ↓
Cordum Safety Kernel: evaluates against policy
    ↓
Decision: DENY — "Destructive operations on production are not allowed"
    ↓
Result: Command never reaches your cluster.
```

That's the entire idea. Policy-as-code, evaluated pre-dispatch, with a full audit trail.
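To make the intercept-before-execute idea concrete, here's a minimal sketch in Go. The types and function names are illustrative, not Cordum's actual API: the point is that the executor only runs when policy says allow, and an audit record is produced either way.

```go
package main

import (
	"fmt"
	"time"
)

// AuditRecord is an illustrative shape for the audit trail: every
// intent is logged with its decision, whether or not it ran.
type AuditRecord struct {
	Time     time.Time
	Intent   string
	Decision string
	Reason   string
}

// gate sits between the agent and the executor. execute only runs
// when the policy decision is "allow"; denied intents never dispatch.
func gate(intent string, decide func(string) (string, string), execute func(string) error) AuditRecord {
	decision, reason := decide(intent)
	rec := AuditRecord{Time: time.Now(), Intent: intent, Decision: decision, Reason: reason}
	if decision == "allow" {
		_ = execute(intent)
	}
	return rec
}

func main() {
	denyProdDeletes := func(intent string) (string, string) {
		return "deny", "Destructive operations on production are not allowed"
	}
	rec := gate("kubectl delete namespace production", denyProdDeletes, func(string) error {
		fmt.Println("executing") // never reached for denied intents
		return nil
	})
	fmt.Println(rec.Decision, "—", rec.Reason)
}
```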

What it looks like in practice

You write policies in YAML:

```yaml
rules:
  - id: block-destructive-prod
    match:
      risk_tags: [destructive, production]
    decision: deny
    reason: "Destructive prod operations blocked"

  - id: approve-prod-writes
    match:
      risk_tags: [production, write]
    decision: require_approval
    reason: "Production writes need human sign-off"
```
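A sketch of how tag-based matching could work, in Go. Two assumptions here that may not match Cordum's real semantics: a rule matches when all of its risk_tags are present on the intent, and the first matching rule in file order wins.

```go
package main

import "fmt"

// Rule mirrors one entry in a YAML policy file like the one above.
type Rule struct {
	ID       string
	RiskTags []string // all tags must be present on the intent to match
	Decision string
	Reason   string
}

// matchRule returns the first rule whose risk_tags are all present
// on the intent; evaluation order is assumed to be file order.
func matchRule(rules []Rule, intentTags []string) (Rule, bool) {
	tagSet := make(map[string]bool, len(intentTags))
	for _, t := range intentTags {
		tagSet[t] = true
	}
	for _, r := range rules {
		matched := true
		for _, t := range r.RiskTags {
			if !tagSet[t] {
				matched = false
				break
			}
		}
		if matched {
			return r, true
		}
	}
	return Rule{}, false // no rule matched; a real kernel needs a default
}

func main() {
	rules := []Rule{
		{ID: "block-destructive-prod", RiskTags: []string{"destructive", "production"}, Decision: "deny"},
		{ID: "approve-prod-writes", RiskTags: []string{"production", "write"}, Decision: "require_approval"},
	}
	r, _ := matchRule(rules, []string{"destructive", "production"})
	fmt.Println(r.Decision) // deny
}
```

Note the fall-through case: when no rule matches, something has to decide. Defaulting to deny is the conservative choice for production infrastructure.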

Four possible outcomes:

| Decision | What happens |
| --- | --- |
| `allow` | Action proceeds, fully logged |
| `deny` | Blocked; never reaches infrastructure |
| `require_approval` | A human gets pinged; the action waits |
| `constrain` | Allowed, but with enforced limits |

The kernel evaluates every action in under 5ms. Not a sidecar. Not a webhook. It's in the execution path.

Architecture

```
┌─────────────────────────────────────────┐
│  Agent (Claude, GPT, CrewAI, whatever)  │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────┴───────────────────────┐
│  Cordum Control Plane                   │
│  ┌──────────────────────────────────┐   │
│  │  Safety Kernel (policy gate)     │   │
│  │  → allow / deny / approve / cap  │   │
│  └──────────────────────────────────┘   │
│  ┌──────────────────────────────────┐   │
│  │  Scheduler + Workflow Engine     │   │
│  └──────────────────────────────────┘   │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────┴───────────────────────┐
│  Infrastructure (K8s, AWS, GitHub, etc) │
└─────────────────────────────────────────┘
```

Built in Go. NATS JetStream for durable messaging. Redis for state. Not a Python wrapper around an LLM API call.

What Cordum is NOT

This matters:

It's not an agent framework. It doesn't replace CrewAI or LangChain. It governs them. Use whatever agent framework you want — Cordum sits underneath.

It's not a workflow engine. Airflow runs DAGs. Cordum decides whether a step in your DAG is allowed to run.

It's not post-hoc logging. By the time you're reading logs, the damage is done. Cordum blocks bad actions before they execute.

The pack system

Want to add capabilities? Install a pack:

```shell
cordum pack install slack
cordum pack install kubernetes
cordum pack install github
```

16 packs available. Each ships as a signed OCI container. Installs with policy overlays — so new capabilities come with governance built in.
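To illustrate the "governance built in" point: a pack's policy overlay might look like the sketch below. This is a hypothetical file, not the actual overlay format shipped with the kubernetes pack — the rule IDs and tags are invented for illustration.

```yaml
# Hypothetical policy overlay installed alongside the kubernetes pack,
# so the pack's new tools are governed from the moment they're available.
rules:
  - id: k8s-block-namespace-delete
    match:
      risk_tags: [destructive, production]
    decision: deny
    reason: "Namespace deletion in production is blocked by default"

  - id: k8s-prod-writes
    match:
      risk_tags: [production, write]
    decision: require_approval
    reason: "Production writes from pack tools need human sign-off"
```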

Rough edges

It's v0.1.0. I'm not going to pretend otherwise.

The docs need work. Some error messages are cryptic. The dashboard is functional, not pretty. I'm one developer building this between my day job and too much coffee.

But the core — the Safety Kernel, the CAP protocol, the policy engine — that works. It's in production. The architecture decisions are ones I'd make again.

Why I think this matters

We're in a weird moment. Everyone's racing to give agents more autonomy. More tool access. More decision-making power. And the governance story is basically: "we'll figure it out later."

Later is a kubectl delete away from being too late.


🔗 GitHub: github.com/cordum-io/cordum
🌐 Website: cordum.io
📖 Docs: cordum.io/docs

If you've ever had an agent demo killed by a security review — I built this for you.

What's your team's approach to agent governance? Genuinely curious. I keep hearing "we just don't deploy them to prod" and that can't be the final answer.
