Mykola Kondratiuk

I Read the Devenex Launch Yesterday - Here's the Policy File Your Agent Repo Is Still Missing

I spent an hour reading the Devenex launch yesterday and the only sentence I keep coming back to is "execution control plane." That phrase is doing a lot of work.

It says: enforcement is a product now. Every agent request gets policy-evaluated, identity-bound, and recorded as evidence before anything runs.

It does not say: the policy itself exists.

Six products ship enforcement. Zero ship the policy.

Look at what shipped this month, and what was already there. Devenex launched May 19 as the first execution control plane. Antigravity 2.0 hardened Git policies at Google I/O Day 2. Notion's External Agent API went GA with workspace-scoped guardrails. Claude has had tool-use limits since launch. OpenAI has function-call constraints. Salesforce Agentforce has action approvals.

Six products. Different vendors. Different layers. All shipping enforcement.

The artifact they all need to enforce against is the same shape. None of them ship it. That artifact is your problem, and it lives in your repo, not theirs.

I started calling it the policy file.

What goes in the policy file

Four sections. I've been writing it this way for a while; the launches this week made me realize it's the same shape across every enforcement product I read the docs for. The shape doesn't depend on the vendor.

Action classes

The agent's universe of possible actions, broken into named classes: read, write, send_external, transact, escalate, spawn_subagent. Each class is a category the policy file attaches constraints to. The act of writing the list is the point. The default in every deployment doc I've seen is implicit: the agent can do anything inside its tool set. Naming classes is how you refuse that default.

A sketch in YAML:

action_classes:
  read:
    sources: [crm.contacts, crm.opportunities]
  write:
    targets: [crm.opportunities.notes]
  send_external:
    channels: [email, slack-dm]
  transact:
    instruments: [stripe.refund]

That's not a real schema. It's the shape your real schema settles into after the third review.
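
To make "refusing the default" concrete, here is a minimal sketch of a deny-by-default lookup. It assumes the YAML above has already been loaded into a plain dictionary; the names `ACTION_CLASSES` and `is_allowed` are hypothetical and not taken from any vendor's API.

```python
# Hypothetical mirror of the action_classes sketch, as plain Python data.
# Each class maps to the resources the policy names for it.
ACTION_CLASSES = {
    "read": {"crm.contacts", "crm.opportunities"},
    "write": {"crm.opportunities.notes"},
    "send_external": {"email", "slack-dm"},
    "transact": {"stripe.refund"},
}


def is_allowed(action_class: str, resource: str) -> bool:
    """Return True only if the class is named and the resource is listed under it."""
    allowed = ACTION_CLASSES.get(action_class)
    if allowed is None:
        # A class nobody wrote down is refused, not silently permitted.
        return False
    return resource in allowed
```

Note that `spawn_subagent` is not in the sketch's list, so under this check it is refused until someone writes it down.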

Blast radius caps

A number per class. Not a vague guardrail, a number the enforcement layer can compare against at request time.

caps:
  write.records_per_run: 50
  send_external.recipients_per_session: 10
  transact.usd_per_run: 500
  spawn_subagent.depth: 2

The contrast: the deployment doc says "the agent has access to the CRM." The policy file says "the agent's write class is capped at fifty records per run." One sentence Devenex can check. One sentence Antigravity can check. One sentence Claude tool-use can check.
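
Here is roughly what "a number the enforcement layer can compare against" could look like at request time. This is a sketch under assumptions: the caps are loaded into a flat dictionary keyed the same way as the YAML, and `within_cap` is a hypothetical helper, not how any of the products above implement it.

```python
# Hypothetical in-memory copy of the caps section.
CAPS = {
    "write.records_per_run": 50,
    "send_external.recipients_per_session": 10,
    "transact.usd_per_run": 500,
    "spawn_subagent.depth": 2,
}


def within_cap(cap_key: str, used_so_far: float, requested: float) -> bool:
    """Check whether a request fits under the cap, counting what the run already used."""
    cap = CAPS.get(cap_key)
    if cap is None:
        # No number was written for this key, so there is nothing to compare against.
        return False
    return used_so_far + requested <= cap
```

The check tracks cumulative usage rather than a single request, because a cap "per run" is only meaningful if fifty writes of one record count the same as one write of fifty.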

Escalation triggers

The flip side of the allowlist. When the agent hits a class not in its policy, or a cap it's about to exceed, what fires? Named human. Named channel. Named SLA.

escalation:
  - class: write
    trigger: cap_exceeded
    route: "#agent-ops"
    owner: "@owner-of-record"
    sla_hours: 4
  - class: transact
    trigger: any
    route: "#finance-approvals"
    owner: "@treasury-lead"
    sla_hours: 1

The deployment doc has "agent owner" once on page one. The policy file has an escalation route per class.
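
A minimal sketch of resolving a route per class, assuming the escalation list above is loaded as a list of dictionaries. The function name `find_route` and the rule that `trigger: any` matches every trigger are my assumptions about how such a file would be read.

```python
from typing import Optional

# Hypothetical in-memory copy of the escalation section.
ESCALATION = [
    {"class": "write", "trigger": "cap_exceeded", "route": "#agent-ops",
     "owner": "@owner-of-record", "sla_hours": 4},
    {"class": "transact", "trigger": "any", "route": "#finance-approvals",
     "owner": "@treasury-lead", "sla_hours": 1},
]


def find_route(action_class: str, trigger: str) -> Optional[dict]:
    """Return the first rule matching the class and trigger, or None if none is written."""
    for rule in ESCALATION:
        # Assumption: a rule with trigger "any" matches every trigger for its class.
        if rule["class"] == action_class and rule["trigger"] in (trigger, "any"):
            return rule
    return None
```

A `None` result is itself a finding: the agent hit a situation with no named human, channel, or SLA, which is the gap the policy file exists to close.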

Evidence schema

What the agent has to log so a human can audit the run afterward. Structured output. The action class invoked. The tool calls. The identity the agent acted as. The policy version. The escalation path if any.

evidence:
  required_fields:
    - run_id
    - policy_version
    - action_class
    - tool_calls
    - acting_identity
    - escalation_record
  format: jsonl
  retention_days: 365

Without an evidence schema, you can't answer "did the agent follow the policy?" after the fact. A policy you can't audit was unenforceable from the start.
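
A sketch of what wiring this into a logging path might look like, using only the standard library. The field list comes from the YAML above; `missing_fields` and `to_jsonl_line` are hypothetical helpers, and I'm assuming a run with no escalation still writes the `escalation_record` key with a null value.

```python
import json

# Field names copied from the evidence section's required_fields.
REQUIRED_FIELDS = [
    "run_id",
    "policy_version",
    "action_class",
    "tool_calls",
    "acting_identity",
    "escalation_record",
]


def missing_fields(record: dict) -> list:
    """List the required fields that are absent from an evidence record."""
    return [field for field in REQUIRED_FIELDS if field not in record]


def to_jsonl_line(record: dict) -> str:
    """Serialize one evidence record as a single JSON line, refusing incomplete records."""
    missing = missing_fields(record)
    if missing:
        raise ValueError(f"evidence record missing fields: {missing}")
    # json.dumps emits no raw newlines by default, so the result is one JSONL line.
    return json.dumps(record, sort_keys=True)
```

Failing loudly on an incomplete record is a design choice: a run that can't produce its evidence is treated the same as a run that broke the policy.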

A specific moment that made this concrete

I was reading through a deployment doc for an agent recently. Clean prose. Listed the APIs. Listed the data sources. Useful agent.

No section for what happens when it tries to write five thousand records. No section for what happens when it tries to send to two hundred recipients. No section for what happens when it transacts above a cap, because nobody had written the cap.

The deployment doc wasn't wrong. It was answering the wrong question. It answered "what does the agent do?" The policy file answers "what is the agent allowed to do, and what fires if any of that breaks?"

Different artifact. Different reviewer. Different file.

The clean split: enforcement vs. authoring

Devenex et al. ship enforcement. That half is done. The other half - authoring - isn't a product, and I don't think it can be one. Authoring is the codification of your team's actual judgment about what the agent should be allowed to do. That judgment is cross-functional: engineering knows the runtime, security knows the threat model, legal knows the constraint, finance knows the cap.

It's not "PM lobs a doc over the wall." The PM convenes the call, drafts the file, opens the PR. Engineering reviews it the same way it reviews a Terraform plan. Security reviews it the same way it reviews IAM. The policy ships in the same PR as the agent.

That's policy-as-code, the shape devs already know from infra. The new thing isn't the shape; it's the artifact existing for AI agents at all.

What I'd do this week if I were shipping an agent

Open a policy.yaml in the agent repo. Stub the four sections. Pin one number per class even if it's a wild guess. Wire the evidence schema into the agent's logging path. Put it in the same PR as the next prompt change.
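
If you want the PR to fail when the stub is incomplete, a small lint is enough. The sketch below assumes policy.yaml has already been parsed into a dictionary by whatever YAML loader the repo uses (not shown); `lint_policy` is a hypothetical name, and the "class name is the part before the first dot in a cap key" rule is my reading of the caps sketch earlier in this post.

```python
REQUIRED_SECTIONS = ("action_classes", "caps", "escalation", "evidence")


def lint_policy(policy: dict) -> list:
    """Return a list of problems: missing sections and classes with no cap pinned."""
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in policy]

    # Cap keys look like "write.records_per_run"; the prefix names the class.
    caps = policy.get("caps") or {}
    capped_classes = {key.split(".", 1)[0] for key in caps}

    for name in policy.get("action_classes") or {}:
        if name not in capped_classes:
            problems.append(f"no cap pinned for class: {name}")
    return problems
```

An empty list means the four sections exist and every named class has at least one number. It says nothing about whether the numbers are good; that part is still the review.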

The enforcement layer your platform vendor ships is checking against something. If nobody wrote the something, the enforcement is checking against silence.

What's the section your agent repo is missing first - blast radius caps, or the evidence schema?

Top comments (1)

Mykola Kondratiuk

honestly, the cleanest pushback on this piece is that "policy as code" works for the boring action classes (read, write, send) but completely punts on the spawn-subagent dimension, which is where the actual loss-of-control risk lives. capping subagent depth at 2 is a number i picked because i didn’t have a better one. anyone wired this up against a real recursion budget yet?