<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Navayuvan SB</title>
    <description>The latest articles on DEV Community by Navayuvan SB (@navayuvan).</description>
    <link>https://dev.to/navayuvan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3895669%2F260bb2b8-e6f3-4d7a-9c87-44fb3ec75a73.jpg</url>
      <title>DEV Community: Navayuvan SB</title>
      <link>https://dev.to/navayuvan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/navayuvan"/>
    <language>en</language>
    <item>
      <title>Three Layers of Tool Call Hardening for AI Agents</title>
      <dc:creator>Navayuvan SB</dc:creator>
      <pubDate>Mon, 11 May 2026 16:41:58 +0000</pubDate>
      <link>https://dev.to/navayuvan/three-layers-of-tool-call-hardening-for-ai-agents-3fjc</link>
      <guid>https://dev.to/navayuvan/three-layers-of-tool-call-hardening-for-ai-agents-3fjc</guid>
      <description>&lt;p&gt;In current software engineering,We're building a lot of AI Agents on our products right now. And having an AI agent in your product is how you keep your product alive, right? That's how the world is moving.&lt;/p&gt;

&lt;p&gt;And while everyone is busy building AI agents — tweaking prompts, wiring up tool calls, agonizing over model choice and parameters — there is one critical area most developers skip.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool harness and security.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not the prompt. Not the model. The harness around your tools — how you design them, constrain them, and control what the agent can actually do with them.&lt;/p&gt;

&lt;p&gt;And skipping this will cost you a lot in terms of both security and reliability.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Even Is a Tool Harness?
&lt;/h2&gt;

&lt;p&gt;When you give an AI agent a tool, you're not just giving it a function. You're giving it a boundary. A set of rules about what it can touch, what it can't, and how it should behave when it acts.&lt;/p&gt;

&lt;p&gt;Most of us don't think about it that way. We write the tool, attach it to the agent, and move on. The harness — the constraints, the access controls, the behavioral guardrails — gets left to the prompt.&lt;/p&gt;

&lt;p&gt;That's the mistake.&lt;/p&gt;

&lt;p&gt;Prompts can be overridden. Prompts can be manipulated. Prompts can be ignored. The harness needs to live at the code level, the execution level, the architecture level.&lt;/p&gt;

&lt;p&gt;And here's how to build it properly. There are three layers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 1: Strip Identity Params — Inject Them Server-Side
&lt;/h2&gt;

&lt;p&gt;The first layer is about access control. And it starts with your tool schema.&lt;/p&gt;

&lt;p&gt;Let's say you're building a to-do app with an AI agent. You give it a &lt;code&gt;list_tasks&lt;/code&gt; tool. Your schema looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"list_tasks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"user_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"due_before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks fine, right?&lt;/p&gt;

&lt;p&gt;It's not.&lt;/p&gt;

&lt;p&gt;Because &lt;code&gt;user_id&lt;/code&gt; is in the schema, the agent can pass &lt;em&gt;any&lt;/em&gt; user ID it wants. A malicious prompt, a confused model, a prompt injection — any of these could have your agent fetching data it has absolutely no business touching. There's no authentication. There's no authorization.&lt;/p&gt;

&lt;p&gt;The fix: strip all identity params from the schema. Things like &lt;code&gt;user_id&lt;/code&gt;, &lt;code&gt;account_id&lt;/code&gt;, &lt;code&gt;workspace_id&lt;/code&gt;, &lt;code&gt;knowledge_base_id&lt;/code&gt; — these define the scope of who sees what. The agent doesn't get to decide scope. You do.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"list_tasks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"due_before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And when the tool executes, inject the identity yourself — from the authenticated session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;list_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Filters&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// you control this, not the agent&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent says &lt;em&gt;what&lt;/em&gt; it needs. You decide &lt;em&gt;whose&lt;/em&gt; data gets touched. That's the harness. 💡&lt;/p&gt;
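&lt;p&gt;One way to wire this up is to bind the session at tool-registration time, so the schema the agent sees never mentions identity at all. A minimal sketch (the &lt;code&gt;makeListTasksTool&lt;/code&gt; shape and the in-memory task list are illustrative, not any framework's API):&lt;/p&gt;

```typescript
// Illustrative sketch (the makeListTasksTool shape and the in-memory task
// list are assumptions, not any framework's API): bind the authenticated
// session via a closure so the agent-visible schema never exposes identity.
type Task = { id: string; userId: string; status: string };
type Filters = { status?: string };
type Session = { userId: string };

const tasks: Task[] = [
  { id: "t1", userId: "alice", status: "open" },
  { id: "t2", userId: "bob", status: "open" },
];

function makeListTasksTool(session: Session) {
  return {
    name: "list_tasks",
    // Only non-identity parameters are visible to the agent.
    parameters: { filters: { status: "string" } },
    execute: (params: { filters: Filters }) =>
      tasks
        // Identity is injected here; the agent cannot override it.
        .filter((t) => t.userId === session.userId)
        .filter((t) => {
          if (params.filters.status === undefined) return true;
          return t.status === params.filters.status;
        }),
  };
}
```

&lt;p&gt;Because the closure owns the session, even a prompt-injected &lt;code&gt;user_id&lt;/code&gt; in the arguments has nothing to bind to.&lt;/p&gt;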




&lt;h2&gt;
  
  
  Layer 2: Enforce Behavioral Constraints at the Code Level
&lt;/h2&gt;

&lt;p&gt;The second layer is about how your tools behave — not just what they can access.&lt;/p&gt;

&lt;p&gt;If you've used Claude Code, you've probably seen this error:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"A file cannot be written before it has been read."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's not a prompt instruction. That's a hard constraint baked into the tool itself. The Claude Code team took a very human behavior — open the file, read it, understand it, &lt;em&gt;then&lt;/em&gt; edit it — and enforced it at the execution level.&lt;/p&gt;

&lt;p&gt;That's exactly what we need to do with our own tools.&lt;/p&gt;

&lt;p&gt;For example, if you have an &lt;code&gt;update_task&lt;/code&gt; tool, don't let the agent call it cold. Enforce a read-first constraint at the code level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;update_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;UpdateTaskParams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lastRead&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`task_read:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;lastRead&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;lastRead&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Task must be read before it can be updated. Call get_task first.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;task_id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;updates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can mention this in the prompt too — but the check has to live in code. Not just in a system prompt the model might miss or ignore. The execution layer is where the harness lives. 🔒&lt;/p&gt;
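&lt;p&gt;For that check to ever pass, the read tool has to record the timestamp under the same key. A minimal counterpart sketch (the in-memory &lt;code&gt;Map&lt;/code&gt; stands in for whatever cache you actually use, Redis or otherwise):&lt;/p&gt;

```typescript
// Illustrative counterpart: get_task records when a task was last read,
// under the same cache key that update_task checks. The Map is a stand-in
// for a real cache (an assumption for this sketch), and the db lookup
// is stubbed.
const cache = new Map();

type Session = { userId: string };

async function get_task(params: { task_id: string }, session: Session) {
  // ...fetch the task from the database here; stubbed for the sketch...
  const task = { id: params.task_id, title: "placeholder" };
  // Record the read so an update_task call within the next 60s is allowed.
  cache.set(`task_read:${params.task_id}:${session.userId}`, Date.now());
  return task;
}
```

&lt;p&gt;Give the key a TTL in a real cache so a stale read doesn't authorize writes forever.&lt;/p&gt;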




&lt;h2&gt;
  
  
  Layer 3: Pre-flight Validation with a Reasoning Agent
&lt;/h2&gt;

&lt;p&gt;This one is more advanced. I haven't shipped it in my own product yet — but I'm confident it will work.&lt;/p&gt;

&lt;p&gt;The idea: before any tool call executes, require the agent to pass a &lt;code&gt;reason&lt;/code&gt; — a short explanation of &lt;em&gt;why&lt;/em&gt; it's calling that tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"list_tasks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"due_before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This forces the agent to think before it acts. It might actually realize the reason isn't valid and decide not to call the tool at all.&lt;/p&gt;

&lt;p&gt;And you can take it further — spin up a lightweight validation agent running on a small, fast model that takes the tool name, the reason, and the conversation context, and decides whether the call is actually justified:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;validateToolCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;toolName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fast-small-model&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
      Tool requested: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;toolName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
      Reason given: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
      Conversation context: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;

      Is this tool call justified? Reply YES or NO with a brief explanation.
    `&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YES&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the validation agent says no — the tool doesn't run.&lt;/p&gt;

&lt;p&gt;This catches hallucinated tool calls, prompt injection attempts, and cases where the agent is just calling tools out of habit rather than necessity. 🛡️&lt;/p&gt;
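&lt;p&gt;Wiring it into the execution loop looks something like this. It's a sketch of the shape, not a finished gate: the &lt;code&gt;ToolCall&lt;/code&gt; and &lt;code&gt;executeToolCall&lt;/code&gt; names are assumptions, and the validator is stubbed to a non-empty-reason check where a real one would call the small model:&lt;/p&gt;

```typescript
// Illustrative sketch: gate every tool call behind the pre-flight check.
// The validator here is a stub (non-empty reason); in practice it would
// ask the small, fast model whether the call is justified.
type ToolCall = { name: string; reason: string; args: any };

async function validateToolCall(call: ToolCall, context: string) {
  // Stub standing in for the small-model check.
  return call.reason.trim().length > 0;
}

async function executeToolCall(
  call: ToolCall,
  context: string,
  run: (call: ToolCall) => any
) {
  const justified = await validateToolCall(call, context);
  if (!justified) {
    // Feed the refusal back to the agent instead of executing the tool.
    return { error: `Tool call "${call.name}" rejected: no valid reason.` };
  }
  return run(call);
}
```

&lt;p&gt;Returning the rejection as a tool result, rather than throwing, lets the agent see why the call was blocked and correct course.&lt;/p&gt;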




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;We are designing so many agents today. And we're doing it fast. But the harness — the security, the constraints, the access controls — is getting left behind.&lt;/p&gt;

&lt;p&gt;At the very least, we should make sure we're not giving an agent access to something it shouldn't have. That the tools we build have opinions about how they get used. That the guardrails exist at the architecture level, not just in a prompt.&lt;/p&gt;

&lt;p&gt;Strip the identity params. Enforce behavioral constraints in code. Add a reasoning checkpoint before execution.&lt;/p&gt;

&lt;p&gt;These three layers won't just make your agent more secure. They'll make it more reliable, more predictable, and way easier to debug when something goes wrong.&lt;/p&gt;

&lt;p&gt;And trust me — something will go wrong. The question is whether your harness was ready for it.&lt;/p&gt;

&lt;p&gt;Hope you liked the read. Follow me on my socials for more tech content, and see you in the next blog 👋🏻&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
      <category>software</category>
    </item>
    <item>
      <title>I Reverse Engineered Claude's UI Widget — And It Changed How I Think About Building LLM Apps</title>
      <dc:creator>Navayuvan SB</dc:creator>
      <pubDate>Fri, 24 Apr 2026 09:03:05 +0000</pubDate>
      <link>https://dev.to/navayuvan/i-reverse-engineered-claudes-ui-widget-and-it-changed-how-i-think-about-building-llm-apps-483f</link>
      <guid>https://dev.to/navayuvan/i-reverse-engineered-claudes-ui-widget-and-it-changed-how-i-think-about-building-llm-apps-483f</guid>
      <description>&lt;p&gt;So we've all seen Anthropic ship features at an incredible pace, right? And the easy assumption is — ah, they probably have Mythos, some model more powerful than what's publicly available, and they're using that internally to move fast.&lt;/p&gt;

&lt;p&gt;But that's not the only reason. And honestly, it's not even the most interesting one.&lt;/p&gt;

&lt;p&gt;About three months back, I started using Claude as my primary assistant for pretty much everything. And I noticed something that genuinely caught my attention.&lt;/p&gt;

&lt;p&gt;When I ask Claude something simple, it responds in plain text. But when the answer is complex, or when there's a lot of information to show — it renders a UI right inside the Claude app. An interactive widget I can actually play with. Not just text. A real interface.&lt;/p&gt;

&lt;p&gt;I started wondering — &lt;em&gt;how are they doing this?&lt;/em&gt; 🤔&lt;/p&gt;




&lt;h2&gt;
  
  
  My First Guess Was Wrong
&lt;/h2&gt;

&lt;p&gt;My initial assumption was that Anthropic had built a library of React components, given the LLM instructions on when and how to use each one, and when Claude responds, it generates a JSON payload that the frontend maps to those components.&lt;/p&gt;

&lt;p&gt;That seemed reasonable to me.&lt;/p&gt;

&lt;p&gt;I was completely wrong.&lt;/p&gt;

&lt;p&gt;So I opened Claude on the web, pulled up the network tab, and inspected the actual response. I reverse-engineered how Claude renders its UI.&lt;/p&gt;

&lt;p&gt;What I found was surprising. 👀&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Actually Happening
&lt;/h2&gt;

&lt;p&gt;The response wasn't JSON. It wasn't a reference to any predefined component.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It was plain HTML, CSS, and JavaScript — a single file, with inline styles.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's when it clicked. They're not using a component library to build the UI. They went a level below that. They provided Claude with a &lt;strong&gt;design system&lt;/strong&gt; — the design principles, the basic styling rules, how a button should look and behave — and then asked Claude to generate HTML, CSS, and JavaScript on its own.&lt;/p&gt;

&lt;p&gt;They take that single HTML file and &lt;strong&gt;render it in an iframe&lt;/strong&gt;.&lt;/p&gt;
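&lt;p&gt;The rendering side is easy to sketch. This is my guess at the pattern, not Anthropic's actual code: drop the generated single-file HTML into a sandboxed iframe so its scripts stay isolated from the host page. The DOM document is passed in only to keep the sketch testable:&lt;/p&gt;

```typescript
// Illustrative sketch (an assumption about the pattern, not Anthropic's
// actual code): render model-generated HTML in a sandboxed iframe.
// The doc parameter stands in for the browser's document object.
function renderWidget(generatedHtml: string, doc: any) {
  const iframe = doc.createElement("iframe");
  // allow-scripts lets the widget's inline JS run, while the sandbox
  // still blocks same-origin access, top navigation, and popups.
  iframe.setAttribute("sandbox", "allow-scripts");
  iframe.srcdoc = generatedHtml;
  return iframe;
}
```

&lt;p&gt;In the browser you'd call &lt;code&gt;renderWidget(generatedHtml, document)&lt;/code&gt; and append the returned iframe to the chat transcript.&lt;/p&gt;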

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"They didn't build the UI. They taught Claude how to build it."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Think about what this means. LLMs write good code. Anthropic gave Claude a design system and said — generate the UI. And as the model gets smarter, the UI it generates gets better. Automatically. Without changing a single line of their own code. 🚀&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgti20uzhsr5quzpjx2c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgti20uzhsr5quzpjx2c.png" alt="make-ai-think-native" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Splitting the Hands from the Brain
&lt;/h2&gt;

&lt;p&gt;I later came across a blog post from Anthropic that described this concept — &lt;em&gt;splitting the hands from the brain&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The idea is this: most developers write prompts and instructions that are tightly coupled to a specific model. If a model doesn't do something well, you go in and patch the prompt. You hardcode workarounds. You over-instruct.&lt;/p&gt;

&lt;p&gt;What Anthropic is doing instead is providing &lt;strong&gt;raw tools&lt;/strong&gt; to the LLM and letting the model figure out how to use them.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"The instructions stay the same. The model just gets better at using them."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So if you're using Claude Sonnet 4.6, the UI it generates is solid. Move to Opus, it gets significantly better. Move to Mythos — it's on another level entirely. And Anthropic didn't have to touch their instructions. The model just got better at using the same tools.&lt;/p&gt;

&lt;p&gt;That's the key insight. 💡&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9v49j2nz2fcdkjd7f6lz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9v49j2nz2fcdkjd7f6lz.png" alt="agent-harness" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Should Change How We Build LLM Apps
&lt;/h2&gt;

&lt;p&gt;We have access to the same models Anthropic is using. But what are most of us doing? We're hardcoding logic into prompts. We're writing harnesses that are tightly coupled to a specific model's behavior. And the moment a smarter model ships, that harness goes stale.&lt;/p&gt;

&lt;p&gt;We should stop encoding specific instructions into prompts and start thinking about building &lt;strong&gt;better tools with clearer interfaces&lt;/strong&gt; — tools that any LLM, today or two years from now, can pick up and use effectively.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Stop writing instructions for the model. Start building tools for it."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  It's Not as Simple as Writing a Good Prompt
&lt;/h2&gt;

&lt;p&gt;I'll be honest — I used to think building LLM apps was straightforward. Give it a good prompt, tweak it when something breaks, move on.&lt;/p&gt;

&lt;p&gt;That's not how it works.&lt;/p&gt;

&lt;p&gt;Architecting an agent properly takes real thought. What Anthropic is doing is genuinely different from what most companies are doing right now. We're still treating AI like a rule-following system — developers trying to hardcode intelligence into a prompt instead of letting the model use its own.&lt;/p&gt;

&lt;p&gt;Here's a better way to think about it: imagine you're handed a fixed set of components and told to build something. No flexibility, no room to think. You just assemble what's given.&lt;/p&gt;

&lt;p&gt;Now imagine instead someone hands you a &lt;strong&gt;design system&lt;/strong&gt; — guidelines, principles, a foundation — and says, &lt;em&gt;make it look great, adapt as needed&lt;/em&gt;. Suddenly there's room for judgment. For creativity. For the model to actually do what it's good at.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Give a model components, and it assembles. Give it a design system, and it creates."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's what Anthropic figured out. And I think it's worth all of us taking a step back and rethinking how we're building.&lt;/p&gt;

&lt;p&gt;Hope you liked the read, see you in the next blog!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>llm</category>
      <category>software</category>
    </item>
  </channel>
</rss>
