Andrej

Posted on Apr 7 • Edited on Apr 11

Four Write Tools, Zero Confirmation, What Could Go Wrong

#agents #llm #ai #architecture

Agent Internals -- Part 2

So, in the first part we split one big agent into multiple specialist agents and set up model routing. It works, but it's still far from anything you would run in a production system.

This post covers the confirmation gate I (read: me and an LLM) built to fix that: a pending action system that intercepts writes, asks the user, and only executes on explicit approval.

The Problem

The agentic loop from Part 1 calls tools automatically. Claude decides to call create_contact, the loop executes it, the contact exists in your CRM. There's no undo.

This is fine for reads. It's not fine for writes, for two reasons:

Claude hallucinates parameters. "Create a contact for Maria" might become create_contact({ name: "Maria", email: "maria@company.com" }) -- where did that email come from? Claude inferred it. Confidently.
Intent is ambiguous. "I should probably log a call with John" -- is that a request or thinking out loud? The specialist doesn't know. It has log_activity in its tool set, so it uses it.

In both cases, the human needs to see what's about to happen before it happens.

The Design

The confirmation gate sits between the specialist's tool call and the CRM API:

Specialist calls create_contact(...)
     |
     v
executeToolWithConfirmation()
     |
     +-- read tool? -> execute immediately
     +-- write tool? -> save to DB, return "pending_confirmation"
                             |
                             v
                        User sees: "Create contact Maria Garcia -- reply yes to confirm"
                             |
                             +-- "yes" -> execute
                             +-- "no"  -> cancel
                             +-- anything else -> cancel + process new message

Three principles:

Write tools require confirmation. Read tools don't. Searching contacts is harmless. Creating one is not.
One pending action per channel. No queue. A new write replaces any pending one.
Pending actions expire. 5 minutes. If the user walks away, nothing happens.

The Interception Point

Four of thirteen tools are writes, tracked in an explicit set (not a naming convention -- future tools must be added deliberately):

export const WRITE_TOOLS = new Set([
  "create_contact",
  "create_deal",
  "create_task",
  "log_activity",
]);
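
One way to make "added deliberately" something the compiler enforces is to classify every tool in a single typed map and derive the write set from it. This is only a sketch of that idea, not the code from the project: the two read tool names, `TOOL_KINDS`, `WRITE_TOOL_NAMES`, and `requiresConfirmation` are all made up for illustration.

```typescript
// Hypothetical tool-name union: the four write tools come from the post,
// the two read tools are placeholders for illustration.
type ToolName =
  | "create_contact"
  | "create_deal"
  | "create_task"
  | "log_activity"
  | "search_contacts" // assumed read tool
  | "get_pipeline"; // assumed read tool

type ToolKind = "read" | "write";

// Record<ToolName, ToolKind> forces every tool to be classified:
// adding a name to ToolName without an entry here is a compile error.
const TOOL_KINDS: Record<ToolName, ToolKind> = {
  create_contact: "write",
  create_deal: "write",
  create_task: "write",
  log_activity: "write",
  search_contacts: "read",
  get_pipeline: "read",
};

// Derive the runtime set from the classification above.
const WRITE_TOOL_NAMES = new Set<string>(
  (Object.keys(TOOL_KINDS) as ToolName[]).filter((name) => TOOL_KINDS[name] === "write"),
);

// Names that were never classified are treated as writes,
// so an unknown tool cannot slip past the gate.
function requiresConfirmation(name: string): boolean {
  if (!Object.prototype.hasOwnProperty.call(TOOL_KINDS, name)) return true;
  return WRITE_TOOL_NAMES.has(name);
}
```

The trade-off is a bit more ceremony per tool in exchange for never having a tool whose read/write status nobody decided.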

executeToolWithConfirmation wraps the normal executeTool. If the tool is a write and a confirmation context exists, it saves the action and returns a status instead of calling the API:

export async function executeToolWithConfirmation(
  name: string,
  input: ToolInput,
  crm: CrmApiClient,
  confirmation?: ConfirmationContext,
): Promise<string> {
  if (confirmation && WRITE_TOOLS.has(name)) {
    const description = buildActionDescription(name, input);
    await savePendingAction(
      confirmation.channelId,
      name,
      input,
      confirmation.crmApiKey,
      description,
    );
    return JSON.stringify({
      status: "pending_confirmation",
      message: `This action requires confirmation: ${description}`,
    });
  }
  return executeTool(name, input, crm);
}

From Claude's perspective, the tool "succeeded" -- it returned a result. That result just happens to say "pending_confirmation" instead of containing CRM data. The specialist sees this and tells the user what's about to happen.

A buildActionDescription function turns tool calls into something a human can verify -- the user sees "Create contact Maria Garcia, maria@acme.com" instead of raw JSON.
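
The post doesn't show buildActionDescription itself, so here is a minimal sketch of what such a function could look like. The `ToolInput` shape, the field names per tool, and the `DESCRIPTION_TEMPLATES` map are assumptions; only the function name and the example output come from the text above.

```typescript
// Assumed shape: the real ToolInput type is not shown in the post.
type ToolInput = Record<string, unknown>;

// Assumed verb phrases and field order for the four write tools.
const DESCRIPTION_TEMPLATES = new Map<string, { verb: string; fields: string[] }>([
  ["create_contact", { verb: "Create contact", fields: ["name", "email", "company"] }],
  ["create_deal", { verb: "Create deal", fields: ["title", "amount"] }],
  ["create_task", { verb: "Create task", fields: ["title", "due_date"] }],
  ["log_activity", { verb: "Log activity", fields: ["type", "contact", "notes"] }],
]);

function buildActionDescription(name: string, input: ToolInput): string {
  const template = DESCRIPTION_TEMPLATES.get(name);
  // No template: fall back to the raw call so nothing is hidden from the user.
  if (template === undefined) return `${name} ${JSON.stringify(input)}`;

  const values = template.fields
    .map((field) => input[field])
    .filter((value) => value !== undefined && value !== null && value !== "")
    // Strings are shown as-is, everything else as JSON (numbers, booleans, objects).
    .map((value) => (typeof value === "string" ? value : JSON.stringify(value)));

  return values.length > 0 ? `${template.verb} ${values.join(", ")}` : template.verb;
}
```

One caveat with this shape: a field that is sent to the CRM but missing from the template never appears in the description, so the field lists need to cover everything the tool accepts, otherwise a hallucinated value could go unseen.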

Pending Action Storage

Pending actions live in PostgreSQL, not in memory -- because the server might restart, and because multiple messages might arrive between the action being proposed and confirmed.

One Per Channel

The save uses ON CONFLICT (channel_id) DO UPDATE:

INSERT INTO pending_actions
  (channel_id, tool_name, tool_input, crm_api_key, description, expires_at)
VALUES ($1, $2, $3::jsonb, $4, $5, NOW() + INTERVAL '5 minutes')
ON CONFLICT (channel_id) DO UPDATE SET
  tool_name = EXCLUDED.tool_name,
  tool_input = EXCLUDED.tool_input,
  crm_api_key = EXCLUDED.crm_api_key,
  description = EXCLUDED.description,
  created_at = NOW(),
  expires_at = NOW() + INTERVAL '5 minutes'
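
ON CONFLICT (channel_id) only works if channel_id has a unique index or primary key, so the table definition matters here. The post doesn't include the schema; the following is a guess at what it could be, kept as a TypeScript constant the way a migration string might be. Column names are taken from the INSERT above, the types and the constant name are assumptions.

```typescript
// Assumed schema: column names come from the INSERT in the post, the types are guesses.
// The PRIMARY KEY on channel_id is what makes ON CONFLICT (channel_id) valid
// and what guarantees at most one pending action per channel.
const CREATE_PENDING_ACTIONS_TABLE = `
  CREATE TABLE IF NOT EXISTS pending_actions (
    channel_id  TEXT PRIMARY KEY,
    tool_name   TEXT NOT NULL,
    tool_input  JSONB NOT NULL,
    crm_api_key TEXT NOT NULL,
    description TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    expires_at  TIMESTAMPTZ NOT NULL
  )
`;
```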

Why not a queue? Because the conversation is sequential. Queueing multiple pending actions would mean asking "confirm action 1? action 2? action 3?" -- terrible UX for a chat interface. If the specialist produces a second write before the first is confirmed, the second replaces the first. The user's latest request is what matters.

5-Minute Expiry

The getPendingAction query filters on expires_at > NOW(). Expired actions are invisible. Long enough to read the confirmation and type "yes." Short enough that you won't accidentally confirm something you asked about an hour ago.
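
To make the two storage rules concrete (latest write wins, expired actions are invisible), here is a small in-memory model of the same semantics. It is a stand-in for reasoning and tests, not the PostgreSQL implementation; `PendingActionModel`, `TTL_MS`, and the explicit `now` parameter are illustrative names.

```typescript
interface PendingAction {
  toolName: string;
  description: string;
  expiresAt: number; // epoch milliseconds
}

const TTL_MS = 5 * 60 * 1000; // the 5-minute expiry from the post

// In-memory stand-in for the pending_actions table, for illustration only.
class PendingActionModel {
  private readonly byChannel = new Map<string, PendingAction>();

  // Mirrors the upsert: a second save for the same channel replaces the first
  // and restarts the expiry window.
  save(channelId: string, toolName: string, description: string, now: number): void {
    this.byChannel.set(channelId, { toolName, description, expiresAt: now + TTL_MS });
  }

  // Mirrors the expires_at > NOW() filter: expired actions are never returned.
  get(channelId: string, now: number): PendingAction | undefined {
    const action = this.byChannel.get(channelId);
    return action !== undefined && action.expiresAt > now ? action : undefined;
  }
}
```

Passing `now` in explicitly keeps the model deterministic, which makes the expiry boundary easy to test without waiting five minutes.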

The Confirmation Flow

The message handler checks for a pending action before doing anything else:

User says	What happens
"yes" / "y"	Execute the tool, save to session, send result
"no" / "n"	Delete pending action, send "Cancelled."
anything else	Delete pending action, notify, process the new message normally

The third branch matters. If the user has a pending create_contact and sends "actually, show me my pipeline" -- the pending action is cancelled and the pipeline query runs. The user isn't trapped in a confirm/deny loop.
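
The branching in the table can be written as a small pure function, which keeps the decision separate from the side effects. This is a sketch under assumptions: the outcome names, `classifyReply`, and the trim/lowercase normalization are mine, not from the actual handler.

```typescript
type ReplyOutcome = "process_normally" | "execute" | "cancel" | "cancel_and_process";

const YES_REPLIES = new Set(["yes", "y"]);
const NO_REPLIES = new Set(["no", "n"]);

// Decides what the handler should do with an incoming message.
// Normalization (trim + lowercase) is an assumption about how replies are matched.
function classifyReply(hasPendingAction: boolean, text: string): ReplyOutcome {
  if (!hasPendingAction) return "process_normally";
  const normalized = text.trim().toLowerCase();
  if (YES_REPLIES.has(normalized)) return "execute";
  if (NO_REPLIES.has(normalized)) return "cancel";
  // Anything else cancels the pending action and is handled as a new message.
  return "cancel_and_process";
}
```

Note that exact matching means "yes please" falls into the third branch and cancels the action. That is the conservative choice for writes: an ambiguous reply never executes anything.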

Notice the order: deletePendingAction runs before the confirmation check. The action is deleted regardless of what the user says, preventing a race condition where a network retry could execute the same action twice. If confirmation succeeds, the action executes from the in-memory object already loaded.
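
If the load and the delete are two separate statements, two concurrent requests could in principle both load the row before either deletes it. A variation that closes that window is to fold both steps into a single DELETE ... RETURNING, so each saved action can be claimed at most once. This is a sketch of that variation, not the handler's actual code: `Queryable` is a minimal structural type modeled on the `query(text, values)` shape of node-postgres, and `claimPendingAction` is a hypothetical helper.

```typescript
interface PendingActionRow {
  tool_name: string;
  tool_input: unknown;
  crm_api_key: string;
  description: string;
}

// Minimal structural type; assumed to match a pg-style client's query(text, values).
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rows: PendingActionRow[] }>;
}

// Deletes and returns the row in one statement. Expired rows are not matched,
// so they can never be claimed; the next save for the channel overwrites them.
const CLAIM_SQL = `
  DELETE FROM pending_actions
  WHERE channel_id = $1 AND expires_at > NOW()
  RETURNING tool_name, tool_input, crm_api_key, description
`;

// Hypothetical helper: resolves to the pending action, or null if there is none to claim.
async function claimPendingAction(
  db: Queryable,
  channelId: string,
): Promise<PendingActionRow | null> {
  const result = await db.query(CLAIM_SQL, [channelId]);
  return result.rows.length > 0 ? result.rows[0] : null;
}
```

A retried "yes" then finds no row on the second attempt and falls through to normal message processing.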

Evaluator Integration

The evaluator from Part 1 has a problem with confirmation gates. When the specialist returns "I'd like to create contact Maria Garcia -- please confirm," that's technically not answering the question -- the contact hasn't been created. The evaluator would fail it and trigger a retry, creating an infinite loop.

The fix: after the specialist runs, the orchestrator checks whether a pending action was created. If so, it skips evaluation entirely. This is a targeted exception -- if the specialist responds to a write request without triggering a confirmation (e.g., it explains why it can't create the contact), evaluation runs normally.
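
The post doesn't say how the orchestrator detects that a pending action was created. One possible way, sketched here, is to look for the "pending_confirmation" status in the tool results from the run. The `SpecialistRun` shape and both function names are assumptions for illustration.

```typescript
// Assumed result shape; none of these names come from the post.
interface SpecialistRun {
  reply: string;
  toolResults: string[]; // raw JSON strings returned by tools during the run
}

// True if any tool result carries the "pending_confirmation" status from the gate.
function createdPendingAction(run: SpecialistRun): boolean {
  return run.toolResults.some((raw) => {
    try {
      const parsed: unknown = JSON.parse(raw);
      return (
        typeof parsed === "object" &&
        parsed !== null &&
        (parsed as { status?: unknown }).status === "pending_confirmation"
      );
    } catch {
      return false; // non-JSON tool output cannot be a gate response
    }
  });
}

// Evaluation is skipped only when the gate actually intercepted a write.
function shouldEvaluate(run: SpecialistRun): boolean {
  return !createdPendingAction(run);
}
```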

What the User Sees

User: Create a contact for Maria Garcia at Acme Corp, maria@acme.com

Agent: I'll create a new contact with these details:
       - Name: Maria Garcia
       - Email: maria@acme.com
       - Company: Acme Corp

       Create contact Maria Garcia, maria@acme.com, Acme Corp
       -- reply "yes" to confirm or "no" to cancel.

User: yes

Agent: Done! Create contact Maria Garcia, maria@acme.com, Acme Corp.

The specialist writes the natural language explanation. The handler appends the mechanical confirmation prompt. If we switch from "reply yes/no" to inline buttons (Telegram supports them), only the handler changes.
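
As a rough sketch of that split, the handler could own a single function that appends the prompt, so swapping to buttons only touches this one place. `withConfirmationPrompt` is a hypothetical name and the exact wording is copied from the transcript above.

```typescript
// Hypothetical helper: the handler, not the specialist, owns the confirm/cancel wording.
function withConfirmationPrompt(specialistReply: string, description: string): string {
  const prompt = `${description}\n-- reply "yes" to confirm or "no" to cancel.`;
  return `${specialistReply}\n\n${prompt}`;
}
```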

Security: Fail-Closed

The quality evaluator from Part 1 is fail-open -- if it breaks, responses pass through. The confirmation gate is the opposite: fail-closed.

If savePendingAction throws, the specialist loop aborts and the user gets an error message. No write reaches the CRM.

Gate	Failure mode	Why
Quality evaluator	Fail-open	A mediocre response beats no response
Confirmation gate	Fail-closed	An unintended write has real consequences. Block on error.

The asymmetry is intentional. Quality is a nice-to-have. Data integrity is not.
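
The difference between the two failure modes comes down to where the try/catch sits. Here is a stripped-down, synchronous sketch of that contrast; the real functions are async and both function names are made up for illustration.

```typescript
// Fail-open: if the evaluator itself throws, the response is let through.
function evaluateFailOpen(evaluate: () => boolean): boolean {
  try {
    return evaluate();
  } catch {
    return true; // a broken evaluator must not block the reply
  }
}

// Fail-closed: if saving the pending action throws, the error propagates.
// There is deliberately no try/catch and no fallback to executing the tool.
function gateFailClosed(savePending: () => void): "pending_confirmation" {
  savePending();
  return "pending_confirmation";
}
```

The thing to watch for in review is a well-meaning catch block around the save that falls back to calling the tool directly, since that would quietly turn the gate fail-open.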

What This Gets Right

No accidental writes. Every CRM mutation requires explicit human approval. Claude can hallucinate parameters all it wants -- the user sees exactly what's about to be created before it happens.
Minimal disruption. Read tools are unaffected. The confirmation gate is invisible for 9 of 13 tools.
Graceful interruption. Users aren't locked into confirm/deny. Any unrelated message cancels the pending action and continues normally.

What This Doesn't Solve (Yet)

Batch confirmations. "Create contacts for all five people I mentioned" triggers one confirmation per contact. That's five yes/no rounds.
Undo. Confirmation prevents bad writes. It doesn't help after a confirmed write turns out to be wrong.

Human input is still needed to catch hallucinations. This pattern generalizes beyond CRM: any agent that performs mutations should have a human checkpoint before they execute.
Stay tuned for Part 3, which will focus on MCP and a new feature for the agent. I still don't know which one, so be sure to check it out. See you later!

DEV Community