<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Navayuvan SB</title>
    <description>The latest articles on DEV Community by Navayuvan SB (@navayuvan).</description>
    <link>https://dev.to/navayuvan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3895669%2F260bb2b8-e6f3-4d7a-9c87-44fb3ec75a73.jpg</url>
      <title>DEV Community: Navayuvan SB</title>
      <link>https://dev.to/navayuvan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/navayuvan"/>
    <language>en</language>
    <item>
      <title>Three Layers of Tool Call Hardening for AI Agents</title>
      <dc:creator>Navayuvan SB</dc:creator>
      <pubDate>Mon, 11 May 2026 16:41:58 +0000</pubDate>
      <link>https://dev.to/navayuvan/three-layers-of-tool-call-hardening-for-ai-agents-3fjc</link>
      <guid>https://dev.to/navayuvan/three-layers-of-tool-call-hardening-for-ai-agents-3fjc</guid>
      <description>&lt;p&gt;In current software engineering,We're building a lot of AI Agents on our products right now. And having an AI agent in your product is how you keep your product alive, right? That's how the world is moving.&lt;/p&gt;

&lt;p&gt;And while everyone is busy building AI agents — tweaking prompts, wiring up tool calls, agonizing over model choice and parameters — there is one critical area most developers skip.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool harness and security.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not the prompt. Not the model. The harness around your tools — how you design them, constrain them, and control what the agent can actually do with them.&lt;/p&gt;

&lt;p&gt;And skipping this will cost you a lot in terms of both security and reliability.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Even Is a Tool Harness?
&lt;/h2&gt;

&lt;p&gt;When you give an AI agent a tool, you're not just giving it a function. You're giving it a boundary. A set of rules about what it can touch, what it can't, and how it should behave when it acts.&lt;/p&gt;

&lt;p&gt;Most of us don't think about it that way. We write the tool, attach it to the agent, and move on. The harness — the constraints, the access controls, the behavioral guardrails — gets left to the prompt.&lt;/p&gt;

&lt;p&gt;That's the mistake.&lt;/p&gt;

&lt;p&gt;Prompts can be overridden. Prompts can be manipulated. Prompts can be ignored. The harness needs to live at the code level, the execution level, the architecture level.&lt;/p&gt;

&lt;p&gt;And here's how to build it properly. There are three layers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 1: Strip Identity Params — Inject Them Server-Side
&lt;/h2&gt;

&lt;p&gt;The first layer is about access control. And it starts with your tool schema.&lt;/p&gt;

&lt;p&gt;Let's say you're building a to-do app with an AI agent. You give it a &lt;code&gt;list_tasks&lt;/code&gt; tool. Your schema looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"list_tasks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"user_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"due_before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks fine, right?&lt;/p&gt;

&lt;p&gt;It's not.&lt;/p&gt;

&lt;p&gt;Because &lt;code&gt;user_id&lt;/code&gt; is in the schema, the agent can pass &lt;em&gt;any&lt;/em&gt; user ID it wants. A malicious prompt, a confused model, a prompt injection — any of these could have your agent fetching data it has absolutely no business touching. There's no authentication. There's no authorization.&lt;/p&gt;

&lt;p&gt;The fix: strip all identity params from the schema. Things like &lt;code&gt;user_id&lt;/code&gt;, &lt;code&gt;account_id&lt;/code&gt;, &lt;code&gt;workspace_id&lt;/code&gt;, &lt;code&gt;knowledge_base_id&lt;/code&gt; — these define the scope of who sees what. The agent doesn't get to decide scope. You do.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"list_tasks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"due_before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And when the tool executes, inject the identity yourself — from the authenticated session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;list_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Filters&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// you control this, not the agent&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent says &lt;em&gt;what&lt;/em&gt; it needs. You decide &lt;em&gt;whose&lt;/em&gt; data gets touched. That's the harness. 💡&lt;/p&gt;
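&lt;p&gt;One way to wire this up is to bind the session at tool-registration time, so the schema the agent sees never mentions identity at all. A minimal sketch (the &lt;code&gt;makeListTasksTool&lt;/code&gt; shape and the in-memory task list are illustrative, not any framework's API):&lt;/p&gt;

```typescript
// Illustrative sketch (the makeListTasksTool shape and the in-memory task
// list are assumptions, not any framework's API): bind the authenticated
// session via a closure so the agent-visible schema never exposes identity.
type Task = { id: string; userId: string; status: string };
type Filters = { status?: string };
type Session = { userId: string };

const tasks: Task[] = [
  { id: "t1", userId: "alice", status: "open" },
  { id: "t2", userId: "bob", status: "open" },
];

function makeListTasksTool(session: Session) {
  return {
    name: "list_tasks",
    // Only non-identity parameters are visible to the agent.
    parameters: { filters: { status: "string" } },
    execute: (params: { filters: Filters }) =>
      tasks
        // Identity is injected here; the agent cannot override it.
        .filter((t) => t.userId === session.userId)
        .filter((t) => {
          if (params.filters.status === undefined) return true;
          return t.status === params.filters.status;
        }),
  };
}
```

&lt;p&gt;Because the closure owns the session, even a prompt-injected &lt;code&gt;user_id&lt;/code&gt; in the arguments has nothing to bind to.&lt;/p&gt;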




&lt;h2&gt;
  
  
  Layer 2: Enforce Behavioral Constraints at the Code Level
&lt;/h2&gt;

&lt;p&gt;The second layer is about how your tools behave — not just what they can access.&lt;/p&gt;

&lt;p&gt;If you've used Claude Code, you've probably seen this error:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"A file cannot be written before it has been read."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's not a prompt instruction. That's a hard constraint baked into the tool itself. The Claude Code team took a very human behavior — open the file, read it, understand it, &lt;em&gt;then&lt;/em&gt; edit it — and enforced it at the execution level.&lt;/p&gt;

&lt;p&gt;That's exactly what we need to do with our own tools.&lt;/p&gt;

&lt;p&gt;For example, if you have an &lt;code&gt;update_task&lt;/code&gt; tool, don't let the agent call it cold. Enforce a read-first constraint at the code level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;update_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;UpdateTaskParams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lastRead&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`task_read:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;lastRead&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;lastRead&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Task must be read before it can be updated. Call get_task first.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;task_id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;updates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can mention this in the prompt too — but the check has to live in code. Not just in a system prompt the model might miss or ignore. The execution layer is where the harness lives. 🔒&lt;/p&gt;
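&lt;p&gt;For that check to ever pass, the read tool has to record the timestamp under the same key. A minimal counterpart sketch (the in-memory &lt;code&gt;Map&lt;/code&gt; stands in for whatever cache you actually use, Redis or otherwise):&lt;/p&gt;

```typescript
// Illustrative counterpart: get_task records when a task was last read,
// under the same cache key that update_task checks. The Map is a stand-in
// for a real cache (an assumption for this sketch), and the db lookup
// is stubbed.
const cache = new Map();

type Session = { userId: string };

async function get_task(params: { task_id: string }, session: Session) {
  // ...fetch the task from the database here; stubbed for the sketch...
  const task = { id: params.task_id, title: "placeholder" };
  // Record the read so an update_task call within the next 60s is allowed.
  cache.set(`task_read:${params.task_id}:${session.userId}`, Date.now());
  return task;
}
```

&lt;p&gt;Give the key a TTL in a real cache so a stale read doesn't authorize writes forever.&lt;/p&gt;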




&lt;h2&gt;
  
  
  Layer 3: Pre-flight Validation with a Reasoning Agent
&lt;/h2&gt;

&lt;p&gt;This one is more advanced. I haven't shipped it in my own product yet — but I'm confident it will work.&lt;/p&gt;

&lt;p&gt;The idea: before any tool call executes, require the agent to pass a &lt;code&gt;reason&lt;/code&gt; — a short explanation of &lt;em&gt;why&lt;/em&gt; it's calling that tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"list_tasks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"due_before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This forces the agent to think before it acts. It might actually realize the reason isn't valid and decide not to call the tool at all.&lt;/p&gt;

&lt;p&gt;And you can take it further — spin up a lightweight validation agent running on a small, fast model that takes the tool name, the reason, and the conversation context, and decides whether the call is actually justified:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;validateToolCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;toolName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fast-small-model&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
      Tool requested: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;toolName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
      Reason given: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
      Conversation context: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;

      Is this tool call justified? Reply YES or NO with a brief explanation.
    `&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YES&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the validation agent says no — the tool doesn't run.&lt;/p&gt;

&lt;p&gt;This catches hallucinated tool calls, prompt injection attempts, and cases where the agent is just calling tools out of habit rather than necessity. 🛡️&lt;/p&gt;
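&lt;p&gt;Wiring it into the execution loop looks something like this. It's a sketch of the shape, not a finished gate: the &lt;code&gt;ToolCall&lt;/code&gt; and &lt;code&gt;executeToolCall&lt;/code&gt; names are assumptions, and the validator is stubbed to a non-empty-reason check where a real one would call the small model:&lt;/p&gt;

```typescript
// Illustrative sketch: gate every tool call behind the pre-flight check.
// The validator here is a stub (non-empty reason); in practice it would
// ask the small, fast model whether the call is justified.
type ToolCall = { name: string; reason: string; args: any };

async function validateToolCall(call: ToolCall, context: string) {
  // Stub standing in for the small-model check.
  return call.reason.trim().length > 0;
}

async function executeToolCall(
  call: ToolCall,
  context: string,
  run: (call: ToolCall) => any
) {
  const justified = await validateToolCall(call, context);
  if (!justified) {
    // Feed the refusal back to the agent instead of executing the tool.
    return { error: `Tool call "${call.name}" rejected: no valid reason.` };
  }
  return run(call);
}
```

&lt;p&gt;Returning the rejection as a tool result, rather than throwing, lets the agent see why the call was blocked and correct course.&lt;/p&gt;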




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;We are designing so many agents today. And we're doing it fast. But the harness — the security, the constraints, the access controls — is getting left behind.&lt;/p&gt;

&lt;p&gt;At the very least, we should make sure we're not giving an agent access to something it shouldn't have. That the tools we build have opinions about how they get used. That the guardrails exist at the architecture level, not just in a prompt.&lt;/p&gt;

&lt;p&gt;Strip the identity params. Enforce behavioral constraints in code. Add a reasoning checkpoint before execution.&lt;/p&gt;

&lt;p&gt;These three layers won't just make your agent more secure. They'll make it more reliable, more predictable, and way easier to debug when something goes wrong.&lt;/p&gt;

&lt;p&gt;And trust me — something will go wrong. The question is whether your harness was ready for it.&lt;/p&gt;

&lt;p&gt;Hope you liked the read. Follow me on my socials for more tech content, and see you in the next blog 👋🏻&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
      <category>software</category>
    </item>
    <item>
      <title>I Reverse Engineered Claude's UI Widget — And It Changed How I Think About Building LLM Apps</title>
      <dc:creator>Navayuvan SB</dc:creator>
      <pubDate>Fri, 24 Apr 2026 09:03:05 +0000</pubDate>
      <link>https://dev.to/navayuvan/i-reverse-engineered-claudes-ui-widget-and-it-changed-how-i-think-about-building-llm-apps-483f</link>
      <guid>https://dev.to/navayuvan/i-reverse-engineered-claudes-ui-widget-and-it-changed-how-i-think-about-building-llm-apps-483f</guid>
      <description>&lt;p&gt;So we've all seen Anthropic ship features at an incredible pace, right? And the easy assumption is — ah, they probably have Mythos, some model more powerful than what's publicly available, and they're using that internally to move fast.&lt;/p&gt;

&lt;p&gt;But that's not the only reason. And honestly, it's not even the most interesting one.&lt;/p&gt;

&lt;p&gt;About three months back, I started using Claude as my primary assistant for pretty much everything. And I noticed something that genuinely caught my attention.&lt;/p&gt;

&lt;p&gt;When I ask Claude something simple, it responds in plain text. But when the answer is complex, or when there's a lot of information to show — it renders a UI right inside the Claude app. An interactive widget I can actually play with. Not just text. A real interface.&lt;/p&gt;

&lt;p&gt;I started wondering — &lt;em&gt;how are they doing this?&lt;/em&gt; 🤔&lt;/p&gt;




&lt;h2&gt;
  
  
  My First Guess Was Wrong
&lt;/h2&gt;

&lt;p&gt;My initial assumption was that Anthropic had built a library of React components, given the LLM instructions on when and how to use each one, and when Claude responds, it generates a JSON payload that the frontend maps to those components.&lt;/p&gt;

&lt;p&gt;That seemed reasonable to me.&lt;/p&gt;

&lt;p&gt;I was completely wrong.&lt;/p&gt;

&lt;p&gt;So I opened Claude on the web, pulled up the network tab, and inspected the actual response. I reverse-engineered how Claude renders its UI.&lt;/p&gt;

&lt;p&gt;What I found was surprising. 👀&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Actually Happening
&lt;/h2&gt;

&lt;p&gt;The response wasn't JSON. It wasn't a reference to any predefined component.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It was plain HTML, CSS, and JavaScript — a single file, with inline styles.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's when it clicked. They're not using a component library to build the UI. They went a level below that. They provided Claude with a &lt;strong&gt;design system&lt;/strong&gt; — the design principles, the basic styling rules, how a button should look and behave — and then asked Claude to generate HTML, CSS, and JavaScript on its own.&lt;/p&gt;

&lt;p&gt;They take that single HTML file and &lt;strong&gt;render it in an iframe&lt;/strong&gt;.&lt;/p&gt;
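&lt;p&gt;The rendering side is easy to sketch. This is my guess at the pattern, not Anthropic's actual code: drop the generated single-file HTML into a sandboxed iframe so its scripts stay isolated from the host page. The DOM document is passed in only to keep the sketch testable:&lt;/p&gt;

```typescript
// Illustrative sketch (an assumption about the pattern, not Anthropic's
// actual code): render model-generated HTML in a sandboxed iframe.
// The doc parameter stands in for the browser's document object.
function renderWidget(generatedHtml: string, doc: any) {
  const iframe = doc.createElement("iframe");
  // allow-scripts lets the widget's inline JS run, while the sandbox
  // still blocks same-origin access, top navigation, and popups.
  iframe.setAttribute("sandbox", "allow-scripts");
  iframe.srcdoc = generatedHtml;
  return iframe;
}
```

&lt;p&gt;In the browser you'd call &lt;code&gt;renderWidget(generatedHtml, document)&lt;/code&gt; and append the returned iframe to the chat transcript.&lt;/p&gt;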

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"They didn't build the UI. They taught Claude how to build it."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Think about what this means. LLMs write good code. Anthropic gave Claude a design system and said — generate the UI. And as the model gets smarter, the UI it generates gets better. Automatically. Without changing a single line of their own code. 🚀&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgti20uzhsr5quzpjx2c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgti20uzhsr5quzpjx2c.png" alt="make-ai-think-native" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Splitting the Hands from the Brain
&lt;/h2&gt;

&lt;p&gt;I later came across a blog post from Anthropic that described this concept — &lt;em&gt;splitting the hands from the brain&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The idea is this: most developers write prompts and instructions that are tightly coupled to a specific model. If a model doesn't do something well, you go in and patch the prompt. You hardcode workarounds. You over-instruct.&lt;/p&gt;

&lt;p&gt;What Anthropic is doing instead is providing &lt;strong&gt;raw tools&lt;/strong&gt; to the LLM and letting the model figure out how to use them.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"The instructions stay the same. The model just gets better at using them."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So if you're using Claude Sonnet 4.6, the UI it generates is solid. Move to Opus, it gets significantly better. Move to Mythos — it's on another level entirely. And Anthropic didn't have to touch their instructions. The model just got better at using the same tools.&lt;/p&gt;

&lt;p&gt;That's the key insight. 💡&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9v49j2nz2fcdkjd7f6lz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9v49j2nz2fcdkjd7f6lz.png" alt="agent-harness" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Should Change How We Build LLM Apps
&lt;/h2&gt;

&lt;p&gt;We have access to the same models Anthropic is using. But what are most of us doing? We're hardcoding logic into prompts. We're writing harnesses that are tightly coupled to a specific model's behavior. And the moment a smarter model ships, that harness goes stale.&lt;/p&gt;

&lt;p&gt;We should stop encoding specific instructions into prompts and start thinking about building &lt;strong&gt;better tools with clearer interfaces&lt;/strong&gt; — tools that any LLM, today or two years from now, can pick up and use effectively.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Stop writing instructions for the model. Start building tools for it."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  It's Not as Simple as Writing a Good Prompt
&lt;/h2&gt;

&lt;p&gt;I'll be honest — I used to think building LLM apps was straightforward. Give it a good prompt, tweak it when something breaks, move on.&lt;/p&gt;

&lt;p&gt;That's not how it works.&lt;/p&gt;

&lt;p&gt;Architecting an agent properly takes real thought. What Anthropic is doing is genuinely different from what most companies are doing right now. We're still treating AI like a rule-following system — developers trying to hardcode intelligence into a prompt instead of letting the model use its own.&lt;/p&gt;

&lt;p&gt;Here's a better way to think about it: imagine you're handed a fixed set of components and told to build something. No flexibility, no room to think. You just assemble what's given.&lt;/p&gt;

&lt;p&gt;Now imagine instead someone hands you a &lt;strong&gt;design system&lt;/strong&gt; — guidelines, principles, a foundation — and says, &lt;em&gt;make it look great, adapt as needed&lt;/em&gt;. Suddenly there's room for judgment. For creativity. For the model to actually do what it's good at.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Give a model components, and it assembles. Give it a design system, and it creates."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's what Anthropic figured out. And I think it's worth all of us taking a step back and rethinking how we're building.&lt;/p&gt;

&lt;p&gt;Hope you liked the read, see you in the next blog!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>llm</category>
      <category>software</category>
    </item>
  </channel>
</rss>
