I gave my AI agent a boss: a human-approval gate in Slack, over MCP

#ai #slack #mcp #aws

AI agents can now act, not just suggest. They issue refunds, run migrations, message customers. That's powerful — and a little terrifying. "Autonomous" should not mean "unsupervised." The moment an agent can spend money or drop a production table, someone needs to be able to say "wait — not like that."

So I built Aegis: a human-approval control plane for AI agents. Before an agent does anything high-risk, it asks a human for approval in Slack, and waits.

▶️ 90-second demo: https://youtu.be/c1jqPDPo6AU
💻 Code: https://github.com/yama3133/aegis-slack-app

The shape of the idea

The gate has to be decoupled from the agent, because I didn't want to wire approval logic into every agent or framework. The Model Context Protocol (MCP) is a natural seam: Aegis is just an MCP server with three tools, and any agent that can call MCP tools can adopt it without changing its reasoning.

// src/mcp-server.ts
server.registerTool("request_approval", {
  description:
    "Request human approval in Slack before executing a high-risk action. " +
    "ALWAYS call this before destructive, financial, or externally visible actions.",
  inputSchema: {
    agent: z.string(),
    action: z.string(),
    args: z.record(z.string(), z.unknown()),
    risk: z.enum(["low", "medium", "high", "critical"]),
    reason: z.string().optional(),
  },
}, async (input) => {
  const req = await postApprovalRequest(slack, input, DEFAULT_CHANNEL);
  return text({ request_id: req.id, status: req.status, auto_approved: req.autoApproved });
});

Three tools in total: request_approval, check_approval, and wait_for_approval. The agent calls request_approval before a risky action, then blocks on wait_for_approval until a human decides.
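
To make the request-then-wait handshake concrete, here is a minimal sketch of how the three tools could sit on top of a shared store. It assumes an in-memory map and simple polling; `ApprovalStore`, `waitForApproval`, the status names, and the timeout values are all hypothetical and not taken from the Aegis source.

```typescript
// Hypothetical sketch: names, statuses, and timings are assumptions, not Aegis's code.
type Status = "pending" | "approved" | "denied" | "expired";

interface ApprovalRecord {
  id: string;
  status: Status;
  args: Record<string, unknown>;
}

class ApprovalStore {
  private records = new Map<string, ApprovalRecord>();
  private nextId = 1;

  // Roughly what request_approval does: record the request as pending.
  create(args: Record<string, unknown>): ApprovalRecord {
    const rec: ApprovalRecord = { id: String(this.nextId++), status: "pending", args };
    this.records.set(rec.id, rec);
    return rec;
  }

  // Roughly what check_approval does: a non-blocking read.
  get(id: string): ApprovalRecord | undefined {
    return this.records.get(id);
  }

  // Called when a human clicks a button; the first terminal decision wins.
  decide(id: string, status: Exclude<Status, "pending">, args?: Record<string, unknown>): void {
    const rec = this.records.get(id);
    if (!rec || rec.status !== "pending") return;
    rec.status = status;
    if (args) rec.args = args; // Edit & Approve replaces the arguments
  }
}

// Roughly what wait_for_approval does: poll until the status is terminal or
// the wait budget runs out. On timeout the still-pending record is returned,
// so the caller can simply call again.
async function waitForApproval(
  store: ApprovalStore,
  id: string,
  timeoutMs = 30_000,
  pollMs = 500,
): Promise<ApprovalRecord | undefined> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const rec = store.get(id);
    if (!rec || rec.status !== "pending" || Date.now() >= deadline) return rec;
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}
```

Returning the pending record on timeout, rather than throwing, keeps each tool call short and lets the agent loop on wait_for_approval until it sees a terminal status.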

The agent side

In the demo, the agent is an Amazon Bedrock model (Claude Sonnet 4.6) running a Converse API tool-use loop. The only things that make it "safe" are a system prompt and the MCP tools:

Before ANY high-risk action (refunds, deletions, external messages, anything
financial or destructive), you MUST call request_approval ... then call
wait_for_approval until you get a terminal status.
If approved with edited arguments, you MUST use the returned arguments.
If denied or expired, do NOT perform the action.

The agent pauses on its own. No orchestration framework required — just a tool call.
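
The three rules in that prompt can also be mirrored in whatever code actually executes the action, so the outcome does not rest on the prompt alone. The following is a minimal sketch under assumed types: `Decision`, `editedArgs`, and `argsToExecute` are hypothetical names, and the real wait_for_approval payload may be shaped differently.

```typescript
// Hypothetical types: the real wait_for_approval result may look different.
type TerminalStatus = "approved" | "denied" | "expired";

interface Decision {
  status: TerminalStatus;
  // Present only when the reviewer used Edit & Approve.
  editedArgs?: Record<string, unknown>;
}

// Returns the arguments to execute with, or null if the action must not run.
function argsToExecute(
  original: Record<string, unknown>,
  decision: Decision,
): Record<string, unknown> | null {
  if (decision.status !== "approved") return null; // denied or expired: do nothing
  return decision.editedArgs ?? original; // the reviewer's edited arguments win
}
```

A tool-use loop could call this just before dispatching the high-risk tool and skip the call whenever it returns null.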

The approval card

When a request comes in, Aegis posts a Block Kit card to Slack. The reviewer sees the agent, the action, the risk level, the exact arguments, and four buttons:

✅ Approve — the agent proceeds
❌ Deny — the agent safely aborts
✏️ Edit & Approve — the part I'm proudest of
❓ Request Info
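
For readers who haven't built one, a card like this is just a JSON array of Block Kit blocks. Here is a minimal sketch of what it might look like. The block and element types (`section`, `actions`, `button`) are Slack's, but the `ApprovalRequest` shape, the `action_id` values, and the layout are assumptions for illustration and not the actual Aegis card.

```typescript
// Hypothetical request shape and action_id names; only the block types are Slack's.
interface ApprovalRequest {
  id: string;
  agent: string;
  action: string;
  risk: "low" | "medium" | "high" | "critical";
  args: Record<string, unknown>;
}

const FENCE = "`".repeat(3); // mrkdwn code-block delimiter

function button(text: string, actionId: string, value: string, style?: "primary" | "danger") {
  return {
    type: "button",
    text: { type: "plain_text", text, emoji: true },
    action_id: actionId,
    value, // carries the request id back in the interaction payload
    ...(style ? { style } : {}),
  };
}

function approvalCard(req: ApprovalRequest) {
  return [
    {
      type: "section",
      text: {
        type: "mrkdwn",
        text: `*${req.agent}* wants to run *${req.action}* (risk: *${req.risk}*)`,
      },
    },
    {
      // Show the exact arguments so the reviewer approves what will really run.
      type: "section",
      text: { type: "mrkdwn", text: FENCE + JSON.stringify(req.args, null, 2) + FENCE },
    },
    {
      type: "actions",
      elements: [
        button("✅ Approve", "aegis_approve", req.id, "primary"),
        button("❌ Deny", "aegis_deny", req.id, "danger"),
        button("✏️ Edit & Approve", "aegis_edit", req.id),
        button("❓ Request Info", "aegis_info", req.id),
      ],
    },
  ];
}
```

Putting the request id in each button's `value` means the click handler can look the request up without parsing the message text.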

The feature I didn't expect to love: Edit & Approve

A plain yes/no felt too blunt. Real reviewers don't just stop an agent — they correct it. So Edit & Approve opens a modal with the agent's arguments as editable JSON:

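// src/blocks.ts
// The parameter type is an assumed minimal shape for an approval request.
export function editModal(req: { id: string | number; args: Record<string, unknown> }) {
  return {
    type: "modal",
    callback_id: "aegis_edit_modal",
    title: { type: "plain_text", text: `Edit #${req.id}` },
    submit: { type: "plain_text", text: "Approve edited" },
    blocks: [{
      type: "input",
      // Slack requires a label on input blocks; this label text is illustrative.
      label: { type: "plain_text", text: "Arguments (JSON)" },
      element: {
        type: "plain_text_input",
        multiline: true,
        // Pre-fill with the agent's proposed arguments, pretty-printed for editing.
        initial_value: JSON.stringify(req.args, null, 2),
      },
    }],
  };
}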

In the demo, a reviewer lowers a refund from $1,200 to $800 and approves. The agent then executes the refund with the corrected amount and reports the change. Control, not just a veto.
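
Because the reviewer is typing free-form JSON, the submission needs validating before it is treated as approved arguments. A minimal sketch of that step follows; `parseEditedArgs` and `changedKeys` are hypothetical helpers and not functions from the Aegis codebase.

```typescript
// Hypothetical helpers for handling the modal submission.
type ParseResult =
  | { ok: true; args: Record<string, unknown> }
  | { ok: false; error: string };

// Accept only a JSON object, since tool arguments are key/value pairs.
function parseEditedArgs(raw: string): ParseResult {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return { ok: false, error: "Arguments must be valid JSON." };
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return { ok: false, error: "Arguments must be a JSON object." };
  }
  return { ok: true, args: value as Record<string, unknown> };
}

// List the top-level keys the reviewer changed, so the change can be reported.
function changedKeys(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): string[] {
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  return Array.from(keys)
    .filter((k) => JSON.stringify(before[k]) !== JSON.stringify(after[k]))
    .sort();
}
```

On a parse failure, a Slack view_submission handler can reply with `response_action: "errors"` and a message keyed by the input's block_id, which keeps the modal open so the reviewer can fix the JSON.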

Not everything needs a human

If every $45 goodwill refund paged a person, nobody would use this. So Aegis ships a small policy engine:

// src/policy.ts
if (multi.risks.includes(risk) || amount >= multi.minAmountUsd) {
  return { decision: "human", approvalsRequired: 2 };   // e.g. drop a prod table
}
if (auto.enabled && auto.risks.includes(risk) && amount <= auto.maxAmountUsd) {
  return { decision: "auto_approve" };                  // e.g. a $45 refund
}
return { decision: "human", approvalsRequired: 1 };

Auto-approve low-risk actions (🤖 instantly).
N-of-M for critical ones: dropping a production table needs two approvers.
TTL: anything left pending simply expires, so silence never counts as approval. Fail-safe by default.
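
Here is a self-contained sketch of those three rules working together. The rule ordering follows the policy snippet above; the `Policy` shape, the thresholds, the 15-minute TTL, and treating non-financial actions as amount 0 are all assumptions chosen to match the examples in the text.

```typescript
// Hypothetical config shape and thresholds, picked to match the examples in the text.
type Risk = "low" | "medium" | "high" | "critical";

interface Policy {
  auto: { enabled: boolean; risks: Risk[]; maxAmountUsd: number };
  multi: { risks: Risk[]; minAmountUsd: number };
  ttlMs: number;
}

const policy: Policy = {
  auto: { enabled: true, risks: ["low"], maxAmountUsd: 50 },
  multi: { risks: ["critical"], minAmountUsd: 10_000 },
  ttlMs: 15 * 60 * 1000,
};

type PolicyDecision =
  | { decision: "auto_approve" }
  | { decision: "human"; approvalsRequired: number };

// The strict rule is checked first, so a critical or very large action can
// never fall through to auto-approval. Non-financial actions pass amount = 0.
function evaluate(risk: Risk, amount: number, p: Policy = policy): PolicyDecision {
  if (p.multi.risks.includes(risk) || amount >= p.multi.minAmountUsd) {
    return { decision: "human", approvalsRequired: 2 };
  }
  if (p.auto.enabled && p.auto.risks.includes(risk) && amount <= p.auto.maxAmountUsd) {
    return { decision: "auto_approve" };
  }
  return { decision: "human", approvalsRequired: 1 };
}

// N-of-M: count distinct approvers, so one person clicking twice is one vote.
function isFullyApproved(approverIds: string[], approvalsRequired: number): boolean {
  return new Set(approverIds).size >= approvalsRequired;
}

// TTL: a request still pending after ttlMs is expired, never approved.
function isExpired(createdAtMs: number, nowMs: number, ttlMs: number = policy.ttlMs): boolean {
  return nowMs - createdAtMs >= ttlMs;
}
```

Counting distinct user IDs is the detail that makes "two approvers" mean two people rather than two clicks.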

Context, so it's a decision and not a rubber stamp

The thing I learned: a good approval is less about a button and more about context. Two touches:

Plain-language summary of every action via Amazon Bedrock (Claude Haiku 4.5), so reviewers don't read raw JSON.
Related Slack messages pulled onto the card with Slack's Real-Time Search API (assistant.search.context) — the relevant conversation is right there, with permalinks.

The fiddly bit: assistant.search.context needs a fresh action_token, which only arrives on a Slack assistant/mention event and lives for minutes. Approval requests, however, originate outside any Slack event, so I cache the latest token and out-of-band cards can still search.
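
A minimal sketch of such a token cache, assuming a single workspace and a conservative maximum age. `ActionTokenCache` is a hypothetical name, and the five-minute default is an assumed safety margin, not a documented Slack value.

```typescript
// Hypothetical cache; the default max age is an assumption, not a Slack-documented lifetime.
class ActionTokenCache {
  private token: string | null = null;
  private storedAtMs = 0;
  private readonly maxAgeMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(maxAgeMs: number = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.maxAgeMs = maxAgeMs;
    this.now = now;
  }

  // Call from the assistant/mention event handler whenever Slack delivers a token.
  remember(token: string): void {
    this.token = token;
    this.storedAtMs = this.now();
  }

  // Returns the cached token, or null if none has been seen or it is likely stale.
  get(): string | null {
    if (this.token === null) return null;
    if (this.now() - this.storedAtMs > this.maxAgeMs) {
      this.token = null;
      return null;
    }
    return this.token;
  }
}
```

When `get()` returns null, one reasonable fallback is to post the card without related messages rather than fail the approval request.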

How it all fits together

Any agent → request_approval over MCP → Aegis (policy + context + Block Kit) → Slack → human → the decision returns to the agent via wait_for_approval. Every step is written to an audit log.

What I'd tell past me

MCP is the right abstraction for guardrails. Because the gate is decoupled, it works with any agent that can call MCP tools.
"Human in the loop" is a UX problem, not just a yes/no. Summary + context + the ability to edit are what make it usable.
Put the control surface where people already are. Slack means zero new tooling for the approver.

Built for the Slack Agent Builder Challenge. Stack: TypeScript · Slack Bolt (Socket Mode) · MCP · Amazon Bedrock · Slack Real-Time Search API.

⭐ Code & setup: https://github.com/yama3133/aegis-slack-app
▶️ Demo: https://youtu.be/c1jqPDPo6AU