Authora Dev

Why multi-agent AI security is broken (and the identity patterns that actually work)

Last Tuesday, a “harmless” coding agent in staging opened a PR, fetched secrets from the wrong environment, and kicked off a deploy it was never supposed to touch.

Nothing “hacked” us. The agent did exactly what the system allowed.

That’s the part I think a lot of teams miss with multi-agent setups: the problem usually isn’t model quality. It’s identity.

Once you have more than one agent — planner, coder, reviewer, deployer, support bot, whatever — you need answers to very boring questions:

  • Who is this agent, exactly?
  • What is it allowed to do?
  • Can it act on behalf of someone else?
  • How do we prove what happened later?

If you don’t answer those, your “AI fleet” becomes a shared root account with vibes.

The pattern that breaks first: shared credentials

A lot of agent systems still look like this:

Agent A ----\
Agent B -----+----> same API key / same GitHub token / same MCP access
Agent C ----/

It works great until:

  • one agent gets prompt-injected
  • one workflow needs narrower permissions
  • you need an audit trail
  • you want approvals for risky actions
  • you need to revoke one agent without breaking all of them

Shared credentials are convenient, but they destroy attribution and least privilege.

The identity pattern that actually works

The most reliable pattern we’ve seen is:

  1. Give each agent its own cryptographic identity
  2. Issue short-lived delegated access
  3. Enforce policy at the tool boundary
  4. Log every action with agent identity + delegation chain

In practice, it looks like this:

[Human/User]
    |
    | delegates task
    v
[Planner Agent] -- short-lived token --> [Coder Agent]
    |                                        |
    | policy check                           | calls tool / MCP server
    v                                        v
[Approval / Policy Engine] -------------> [GitHub, CI, Cloud, DB]

Audit log = who delegated what to whom, for which action, when

That’s the difference between “an agent did something” and “the review agent, acting on behalf of the release workflow, was allowed to update only this repo for 10 minutes.”
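What does that look like in a log? A single audit record for the flow above can be as small as the object below — field names here are illustrative, not a standard schema:

```javascript
// One audit-log entry per tool call: who acted, on whose behalf,
// what they did, and how long the grant lasted.
// All names and values are illustrative.
const auditRecord = {
  actor: "review-agent",                              // the agent that made the call
  delegationChain: ["alice", "release-workflow", "review-agent"],
  action: "repo:update",
  resource: "github.com/acme/api",
  grantedAt: "2025-01-14T10:00:00Z",
  expiresAt: "2025-01-14T10:10:00Z",                  // the 10-minute window
  decision: "allowed",
};

console.log(JSON.stringify(auditRecord, null, 2));
```

Keeping the delegation chain in every record is the point: it answers "on whose behalf?" without reconstructing state after the fact.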

What to implement first

You do not need a giant platform rollout to improve this.

1) Per-agent identity

Use a distinct identity for every agent process or role. Ideally, that identity is cryptographic, not just a string in config.

Ed25519 keys are a good fit here because they’re fast, small, and easy to verify.

Why it matters:

  • revocation is targeted
  • audit logs become useful
  • tools can verify the caller instead of trusting network location

2) Delegation, not credential sharing

If Agent A needs Agent B to perform work, don’t hand over a long-lived secret. Mint a scoped, short-lived token representing delegated rights.

OAuth token exchange / delegation-chain patterns are solid here. If you’re already using standards like RFC 8693, great. If not, even a simple internal delegation model is better than “just reuse the deploy token.”

3) Policy at the edge

Your tools should not trust every “internal” caller equally.

Put policy checks at the MCP server, gateway, or edge proxy:

  • this agent can read issues
  • that agent can open PRs
  • only approved agents can trigger deploys
  • production actions require human approval

If OPA fits your stack, use OPA. Seriously. You don’t need to reinvent policy engines for this.
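Before you reach for a policy engine, the check itself can start as a lookup table at the tool boundary. Here's a toy version of the bullet list above — an OPA policy would replace this table, and the agent names and action strings are made up for illustration:

```javascript
// A toy policy table enforced at the tool boundary. An MCP server or
// gateway would do this lookup before executing any call.
// Agent names and actions are illustrative.
const policy = {
  "triage-agent": new Set(["issues:read"]),
  "coder-agent": new Set(["issues:read", "pr:open"]),
  "deploy-agent": new Set(["deploy:trigger"]),
};

// Actions that always need a human in the loop, regardless of agent.
const needsApproval = new Set(["deploy:trigger"]);

function authorize(agent, action, humanApproved = false) {
  const allowed = policy[agent]?.has(action) ?? false;
  if (!allowed) return { allow: false, reason: "not in policy" };
  if (needsApproval.has(action) && !humanApproved) {
    return { allow: false, reason: "approval required" };
  }
  return { allow: true };
}

console.log(authorize("coder-agent", "pr:open"));         // { allow: true }
console.log(authorize("deploy-agent", "deploy:trigger")); // { allow: false, reason: 'approval required' }
```

Note the default: an unknown agent gets nothing. Deny-by-default is the property worth keeping when you graduate to a real engine.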

4) Approval workflows for destructive actions

Treat delete, deploy, rotate, publish, and charge as special.

Agents are great at moving fast. That’s exactly why risky actions need explicit approval gates.
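An approval gate can be as simple as "queue the action instead of running it." A minimal in-memory sketch — a real system would persist the queue and notify a reviewer, and the action names here just mirror the list above:

```javascript
// Approval-gate sketch: risky actions are parked in a pending queue and
// only execute after an explicit approve() call. In-memory only.
const RISKY = new Set(["delete", "deploy", "rotate", "publish", "charge"]);
const pending = new Map();
let nextId = 1;

function requestAction(agent, action, run) {
  if (!RISKY.has(action)) return { status: "executed", result: run() };
  const id = nextId++;
  pending.set(id, { agent, action, run });
  return { status: "pending", id };
}

function approve(id) {
  const req = pending.get(id);
  if (!req) throw new Error("unknown approval id");
  pending.delete(id);
  return req.run();
}

const r = requestAction("deploy-agent", "deploy", () => "deployed v1.2.3");
console.log(r.status);      // pending
console.log(approve(r.id)); // deployed v1.2.3
```

The design choice that matters: the gate lives in the execution path, not in the prompt. An agent can be talked into *asking* for a deploy; it can't be talked past the queue.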

A tiny runnable example: generate an agent identity

Here’s a minimal Node example using Ed25519:

npm install tweetnacl tweetnacl-util
node agent-id.js

// agent-id.js
const nacl = require("tweetnacl");
const util = require("tweetnacl-util");

const keypair = nacl.sign.keyPair();

const publicKey = util.encodeBase64(keypair.publicKey);
const secretKey = util.encodeBase64(keypair.secretKey);

console.log("Agent public key:", publicKey);
console.log("Store secret key securely:", secretKey.slice(0, 24) + "...");

This isn’t a full identity system, but it’s the right direction: every agent gets its own keypair, and downstream systems verify who’s calling.

Common mistake: securing the model, not the workflow

Teams spend a lot of time on model guardrails and not enough on execution boundaries.

But in multi-agent systems, the blast radius usually comes from what the agent can do, not what it can say.

A secure fleet is mostly boring infrastructure:

  • identities
  • scoped tokens
  • policy checks
  • approvals
  • audit logs
  • isolation for untrusted execution

That’s true whether you’re orchestrating coding agents, support agents, or background task runners.

Try it yourself

If you want to tighten up your agent security without buying anything first, start with the four steps above: per-agent keys, short-lived delegated tokens, policy checks at the tool boundary, and approval gates for risky actions.

These are useful starting points even if you end up building the rest yourself.

The big shift is simple: stop thinking of agents as “features” and start treating them like workloads with identities.

That’s when multi-agent systems become governable instead of mysterious.

How are you handling agent identity in your stack today? Drop your approach below.

-- Authora team

This post was created with AI assistance.
