Agents don't fail like APIs

#ai #agents #api #agentaichallenge

At my last job, I built an agent.

After the demo went well, the first thing leadership asked me was a single question.

How can I tell what the agent is doing?

I sat there for a second. Then I gave a bad answer. I don’t even remember what it was, something about logs and tracing, but I knew while I was saying it that it didn’t actually answer the question. The question wasn’t hard for the reason my answer assumed. It was hard because an agent isn’t really your code anymore.

An agent runs on inference. It’s not an if-else statement. It’s not deterministic logic as we know it. In practice it acts more like a person than a program.

Once you see it that way, the rest of the questions show up.

How do I stop it if it’s misbehaving? What is it allowed to touch? What is off limits?

You don’t ask those of a function. You ask those of an employee.

I started Checkrd because none of those questions had a good answer on the market.

Why “we’ll just tell the model not to” doesn’t hold up

The reply I get most often when I ask how someone’s preventing their agent from doing something is: we put it in the prompt.

That works. Until it doesn’t. And the thing about prompts is, they don’t fail in a useful way. There’s no exception. There’s no return code. The agent just… did the thing it wasn’t supposed to do, and now you’re filing an incident.

Model alignment is the wrong layer for business rules. The model decides what to generate. It does not decide whether the request your agent is about to send is one you actually wanted made: a call to a third-party API, three hops downstream, with a payload your agent built from a SQL row and a customer email. By the time that request exists, the model is way upstream.

The other place people try to enforce things is the agent framework or tool layer: LangChain, MCP, the OpenAI Agents SDK. Those give you hooks and tracing. Hooks are built mainly to observe. They don’t really block, not in the way a CISO needs.

What’s missing is a layer between the agent and the network. Something that sees every outbound HTTP request before it goes out, runs it against a policy you wrote, and stops it if the policy says stop.
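To make that concrete, here is a minimal sketch of the idea in Python. It is not Checkrd’s API. Every name in it (OutboundRequest, guarded, PolicyDenied, read_only) is made up for illustration, and the transport is a stand-in that records requests instead of touching the network. The point is only the shape: the policy runs before the send, and a refusal surfaces as an exception rather than as silence.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OutboundRequest:
    """Hypothetical description of one outbound HTTP call."""
    method: str
    url: str
    body: str = ""


class PolicyDenied(Exception):
    """Raised instead of sending when the policy says stop."""


def guarded(
    send: Callable[[OutboundRequest], str],
    policy: Callable[[OutboundRequest], bool],
) -> Callable[[OutboundRequest], str]:
    """Wrap a transport so the policy runs before every send."""

    def wrapper(request: OutboundRequest) -> str:
        if not policy(request):
            # The request is dropped here; the wrapped transport is never called.
            raise PolicyDenied(f"{request.method} {request.url}")
        return send(request)

    return wrapper


# Stand-in transport: records what would have gone out on the wire.
sent = []


def fake_send(request: OutboundRequest) -> str:
    sent.append(request)
    return "ok"


def read_only(request: OutboundRequest) -> bool:
    # Example policy (an assumption for this sketch): only GET may leave.
    return request.method == "GET"


send = guarded(fake_send, read_only)
```

Calling send with a GET returns "ok". Calling it with a DELETE raises PolicyDenied and nothing is appended to sent. That is the difference from a prompt: the failure has a type, a stack trace, and a place to log it.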

The fintech version of the question

I had a call with a fintech founder a few months in who put it more bluntly than anyone else.

He wanted his agent handling transactions. Refunds, holds, the stuff that ends up in compliance review. He’d built something good. He wouldn’t ship it.

His framing, roughly:

You’re not putting a thing that infers in the path of money. Not without something that fails closed when it tries to do anything you didn’t authorize.

That’s not a model problem. It’s not a tracing problem. It’s authorization. He needed something that said: the agent can read these endpoints; it can write to these endpoints with these specific constraints; it cannot ever, under any prompt, do this third class of thing. And if it tries, the request never leaves the box.
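Here is roughly what those three classes look like as code. This is a sketch under assumptions, not how Checkrd implements it: the Rule type, the endpoint paths, and the 5,000-cent refund cap are all invented for the example. What it illustrates is the fail-closed part. An explicit deny always wins, a constraint that errors counts as a refusal, and a request that matches no rule is refused by default.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape, invented for this sketch."""
    method: str
    path_prefix: str
    effect: str  # "allow" or "deny"
    constraint: Optional[Callable[[dict], bool]] = None


RULES = [
    # Never, under any prompt: moving money to an external account.
    Rule("POST", "/v1/transfers/external", "deny"),
    # Read access.
    Rule("GET", "/v1/transactions", "allow"),
    # Write access with a constraint: refunds capped at 5,000 cents.
    Rule("POST", "/v1/refunds", "allow",
         constraint=lambda body: 0 < body["amount_cents"] <= 5_000),
]


def authorize(method: str, path: str, body: Optional[dict] = None) -> bool:
    """Return True only if a rule explicitly allows this request."""
    body = body or {}
    matches = [
        r for r in RULES
        if r.method == method and path.startswith(r.path_prefix)
    ]
    # An explicit deny wins over any allow.
    if any(r.effect == "deny" for r in matches):
        return False
    for rule in matches:
        try:
            if rule.constraint is None or rule.constraint(body):
                return True
        except Exception:
            # A constraint that cannot be evaluated counts as a refusal.
            return False
    # Nothing allowed it (no match, or a constraint said no): fail closed.
    return False
```

A refund of 2,500 cents passes. A refund of 250,000 cents, a refund with no amount at all, an external transfer, and a PUT to an endpoint nobody wrote a rule for all come back False.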

We built that.

What Checkrd is

A small library you load into your application. Every API call your agent makes (to OpenAI, to Stripe, to whatever) runs through it before it touches the network. You write a YAML policy describing what you want. The library evaluates it on every call.

Things you can write:

“This agent can call Salesforce GET endpoints; it cannot call DELETE.”
“No agent can hit this endpoint more than 100 times a minute.”
“If a request body matches a credit card pattern, refuse it.”
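For a feel of what evaluating those three rules involves, here is a small Python sketch. It is not Checkrd’s YAML schema or its engine; the function names, the Salesforce host check, and the choice of a sliding-window limiter plus a Luhn checksum are my assumptions for the example. The Luhn step is there so that an arbitrary 16-digit order number is less likely to be mistaken for a card.

```python
import re
import time
from collections import defaultdict, deque

# Candidate card numbers: 13 to 19 digits, optionally split by spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def contains_card_number(body: str) -> bool:
    for match in CARD_CANDIDATE.finditer(body):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return True
    return False


class SlidingWindowLimiter:
    """Allow at most `limit` calls per key in any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.calls = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        timestamps = self.calls[key]
        # Forget calls that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True


limiter = SlidingWindowLimiter(limit=100, window_seconds=60)


def decide(method: str, host: str, path: str, body: str = "", now=None) -> str:
    """Return 'allow', 'deny', or 'throttle' for one outbound call."""
    now = time.monotonic() if now is None else now
    if host.endswith("salesforce.com") and method == "DELETE":
        return "deny"
    if contains_card_number(body):
        return "deny"
    if not limiter.allow(f"{host}{path}", now):
        return "throttle"
    return "allow"
```

The deny checks run before the limiter on purpose, so a refused request does not use up any of the rate budget.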

When something goes sideways, you flip one switch and every agent in your fleet stops within a second. That answers one of the questions from earlier: how do I stop it?
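The in-process half of a switch like that can be sketched in a few lines of Python. This is an illustration only: KillSwitch, AgentHalted, and call_api are hypothetical names, and the sketch says nothing about how the signal reaches every machine in a fleet, which is the hard part and is not shown here. It only shows the check sitting in front of every call.

```python
import threading


class AgentHalted(Exception):
    """Raised when a call is attempted while the switch is tripped."""


class KillSwitch:
    """Thread-safe flag shared by every agent in this process."""

    def __init__(self):
        self._tripped = threading.Event()

    def trip(self) -> None:
        self._tripped.set()

    def reset(self) -> None:
        self._tripped.clear()

    @property
    def tripped(self) -> bool:
        return self._tripped.is_set()


def call_api(switch: KillSwitch, send, request):
    """Check the switch before every outbound call, then send."""
    if switch.tripped:
        # Nothing is sent once the switch is tripped.
        raise AgentHalted("kill switch is tripped")
    return send(request)
```

Because the check happens per call rather than per agent, an agent that is mid-task stops at its next request instead of finishing its plan.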

Every allow / deny / throttle decision is logged. If a regulator asks what an agent called inside a three-minute window last Tuesday, you answer in one query. The schema literally doesn’t have fields for prompts or response bodies. That data never leaves your application.
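As an illustration of what “one query” means, here is a sketch using SQLite from Python’s standard library. The table and column names are my invention for the example, not Checkrd’s actual schema, and the agent ID and timestamps are made-up sample data. Note what the table leaves out: there is no column for a prompt or a response body.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE decisions (
        ts       TEXT NOT NULL,  -- ISO 8601, UTC, fixed width
        agent_id TEXT NOT NULL,
        method   TEXT NOT NULL,
        host     TEXT NOT NULL,
        path     TEXT NOT NULL,
        decision TEXT NOT NULL
                 CHECK (decision IN ('allow', 'deny', 'throttle'))
    )
    """
)


def _iso(ts: datetime) -> str:
    # Fixed-width timestamps sort correctly as plain strings.
    return ts.astimezone(timezone.utc).isoformat(timespec="microseconds")


def record(ts, agent_id, method, host, path, decision) -> None:
    conn.execute(
        "INSERT INTO decisions VALUES (?, ?, ?, ?, ?, ?)",
        (_iso(ts), agent_id, method, host, path, decision),
    )


def calls_between(agent_id, start, end):
    """Everything one agent attempted in the half-open window [start, end)."""
    return conn.execute(
        "SELECT ts, method, host, path, decision FROM decisions "
        "WHERE agent_id = ? AND ts >= ? AND ts < ? ORDER BY ts",
        (agent_id, _iso(start), _iso(end)),
    ).fetchall()


def utc(hour, minute, second=0):
    # Sample date for the demo rows below.
    return datetime(2025, 1, 7, hour, minute, second, tzinfo=timezone.utc)


record(utc(14, 0, 30), "refund-bot", "GET", "api.example.com", "/v1/transactions", "allow")
record(utc(14, 2, 10), "refund-bot", "POST", "api.example.com", "/v1/refunds", "deny")
record(utc(14, 5, 0), "refund-bot", "GET", "api.example.com", "/v1/transactions", "allow")

window = calls_between("refund-bot", utc(14, 0), utc(14, 3))
```

The window holds the first two rows and excludes the 14:05 call. The audit answer is a WHERE clause over metadata, not a search through conversation transcripts.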

The engine is licensed under Apache 2.0. Your security team can read every line before they sign off on running it inside their process.

A note on the rest of the stack

People ask how Checkrd fits with the tools they already have.

Observability tools (Langfuse, Helicone, LangSmith) are the right shape for prompt iteration and evals. They observe. We enforce. Most teams that use one end up using both.

AI gateways (Cloudflare AI Gateway, Portkey) handle provider routing, caching, and edge rate-limiting. Different shape. Run both if you need both.

Agent frameworks and protocols (LangChain, MCP, OpenAI Agents SDK, Vercel AI SDK, Mastra) all have a Checkrd integration. The framework helps you build the agent. We govern what the agent can do once you’ve built it.

What I’d push back on, again: “we’ll just put it in the prompt.” Prompts are probabilistic. Compliance isn’t.

Where this goes

We built Checkrd for fintech and healthtech first because they have the version of these questions that doesn’t have a workaround. Agents that touch money or PHI. Customers who do real diligence. Auditors who want real answers. CISOs who say “no, you can’t ship this until you can show me what it did.”

If you’re building toward production and any of this resonates, I’d love to hear about your stack. The SDKs and the policy engine they wrap are open source, and you can find them at checkrd.io. The quickstart is a five-minute install. If you’re in fintech or healthtech and want to talk specifics, my email’s on the about page.