NIST just closed a public RFI on AI agent security. The question they were asking, in five different ways: how do you constrain what an AI agent can do, and how do you prove it was constrained?
We built something that answers both. Not because we read the RFI — we built it because we ran into the problem first. Reading the RFI afterward was like seeing someone formally describe a thing you've been fixing with duct tape.
The problem frameworks don't solve
Most security frameworks for AI agents focus on what the agent should do: don't call dangerous APIs, don't exfiltrate data, follow least-privilege principles. Good policies. But policies are enforced at configuration time, and AI agents operate at runtime. The gap between "the policy says X" and "the agent did X" is where incidents happen.
The deeper issue: in heterogeneous pipelines, each provider certifies only their own model. AWS certifies Bedrock. OpenAI certifies GPT-4o. Your self-hosted Mistral is self-attested at best. The handoffs between models — where data transforms, where context passes, where decisions compound — are certified by nobody. That's where your audit trail ends.
Three providers. Zero cross-model proof. Most teams find this out when a regulator asks them to replay what happened.
What a certifying proxy does
The architecture is straightforward: the agent never calls external APIs directly. Every outbound action passes through a proxy that scopes the call, executes it, and issues a cryptographic receipt — hashed request, hashed response, timestamp, and the identity of the external endpoint.
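At that layer, a receipt reduces to a few hash-and-stamp operations. Here is a minimal sketch of what the proxy might record per call — the field names are illustrative, not the actual ArkForge schema (which lives in the open proof spec):

```python
import hashlib
import json
import time

def issue_receipt(target: str, method: str, payload: dict,
                  response_body: bytes) -> dict:
    """Hash the request and response, then stamp time and endpoint.
    Field names are illustrative, not the actual ArkForge schema."""
    request_bytes = json.dumps(
        {"target": target, "method": method, "payload": payload},
        sort_keys=True,  # canonical ordering so the hash is reproducible
    ).encode()
    return {
        "request_sha256": hashlib.sha256(request_bytes).hexdigest(),
        "response_sha256": hashlib.sha256(response_body).hexdigest(),
        "timestamp": int(time.time()),
        "endpoint": target,
    }
```

Because both hashes are recomputable from the stored request and response, anyone holding those artifacts can check the receipt without trusting the agent that made the call.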
Integration is a wrapper, not a rewrite. You're adding three fields around calls you're already making:
```python
import requests

# Before: the agent calls the external API directly
requests.post("https://api.github.com/repos/...", json={...})

# After: the same call, routed through the certifying proxy
requests.post("https://trust.arkforge.tech/v1/proxy", json={
    "target": "https://api.github.com/repos/...",
    "method": "POST",
    "payload": {...},
})
```
Target, method, payload. The proxy format is intentionally minimal.
It's a change at every call site — but what you get in return is a cryptographic receipt for every action, timestamped and anchored by a neutral third party. Not by the agent. Not by the model provider.
That last point matters. An agent cannot certify its own actions any more than a witness can notarize their own testimony.
The signing key belongs to ArkForge, not to the executing agent. That's the difference between a notarized document and a self-signed affidavit.
The key design decision: the proxy is model-agnostic and provider-agnostic. It doesn't care whether the agent is Claude, GPT-4o, or a local Mistral instance. Any agent, any API, one enforcement point.
What the proof trail gives you
Every certified action produces a proof_id. That ID links to a page with SHA-256 hashes of request and response, a timestamp verified by a trusted authority (RFC 3161), and a chain hash.
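The chain hash is what makes the trail tamper-evident: each proof commits to the one before it. The exact chaining rule is defined in the proof spec; a common construction, shown here purely as an illustration, hashes the previous chain value together with the current receipt hash:

```python
import hashlib

def extend_chain(prev_chain: str, receipt_sha256: str) -> str:
    """Illustrative chaining rule, not the normative one from the spec:
    each link commits to the previous chain value and the new receipt."""
    return hashlib.sha256((prev_chain + receipt_sha256).encode()).hexdigest()

def replay_chain(receipt_hashes: list[str], genesis: str = "0" * 64) -> str:
    """Replaying receipts in order must reproduce the published head value;
    a missing, altered, or reordered receipt yields a different result."""
    chain = genesis
    for h in receipt_hashes:
        chain = extend_chain(chain, h)
    return chain
```

An auditor who replays the receipts and lands on the published head value has verified the whole history in one comparison.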
Real example: https://trust.arkforge.tech/v1/proof/prf_20260313_145317_c69de2
The proof spec is open and auditable: https://github.com/ark-forge/proof-spec.
The Trust Layer itself is MIT-licensed: https://github.com/ark-forge/trust-layer.
What this doesn't solve
A certifying proxy doesn't prevent an agent from being instructed to do something harmful. It constrains scope, not intent.
It also doesn't solve key management at scale or identity federation across organizations. Those are real problems, and they're next. What it does give you is evidence. Scope without evidence is policy on paper.
Why the NIST RFI matters here
NIST's RFI asked specifically how to constrain AI agent deployment environments and how to prove enforcement happened.
It identified cross-agent interactions — pipelines where one model hands off to another — as a distinct and underaddressed risk category.
We built ArkForge because we were running that exact pipeline and had no way to answer "what did the agent actually do, and who can verify it." The RFI formalizes the problem. The Trust Layer is one answer to it.
Question for the room
Are you enforcing scope at the agent level, the API gateway level, or somewhere else?
And do you have a way to prove enforcement happened — or are you relying on logs?
