If you’re letting AI agents call tools, open pull requests, touch production data, or coordinate work across services, you already have an identity problem.
A lot of agent systems still rely on soft trust: API keys in environment variables, tool access based on network location, or a vague assumption that “the agent running in this session is the same one we started with.” That works right up until it doesn’t. An agent gets replayed, a tool call is spoofed, a session token leaks, or a delegated workflow quietly gains more access than intended.
That’s agent hijacking in practice: an attacker, buggy integration, or misconfigured workflow causes actions to be executed by the wrong agent, with the wrong permissions, and without a reliable way to prove what happened.
The fix is not “more prompts.” It’s the same thing we’ve learned in every other security domain: strong identity, least privilege, and auditable authorization.
## What agent hijacking actually looks like
In most real systems, hijacking doesn’t mean a dramatic Hollywood-style takeover. It usually looks more boring:
- A long-lived API key gets reused by multiple agents
- An MCP server trusts any client that can connect
- An agent delegates a task, but the delegated worker inherits full upstream privileges
- Tool calls aren’t signed, so you can’t prove which agent initiated them
- Approval workflows happen in Slack or email with no cryptographic binding to the final action
- Logs tell you what happened, but not who actually authorized it
Once agents are acting on your behalf, “close enough” identity stops being enough.
## Why API keys and shared service accounts break down
A shared service account can identify an application. It does not identify an individual agent execution, a delegated subtask, or a bounded approval chain.
For agents, you usually need to answer questions like:
- Which agent requested this tool call?
- Was it the original planner agent or a delegated worker?
- What exact permissions did it have at the time?
- Did a human approve this step?
- Can I revoke this one agent without breaking the whole system?
API keys are bad at this because they’re typically:
- static
- shared
- over-scoped
- hard to rotate
- disconnected from execution context
A better model is to give each agent a cryptographic identity, then enforce RBAC or policy-based access at the tool boundary.
## The security baseline: cryptographic identity + RBAC
At a minimum, an agent platform should support:
- A unique cryptographic identity per agent
- Signed requests or assertions
- Short-lived delegated credentials
- Role-based access control
- Audit logs tied to identity, not just sessions
A practical implementation often uses public-key cryptography such as Ed25519 for agent identity. That gives you a keypair per agent, where the private key signs requests and the public key verifies them.
Then layer authorization on top:
- `reader`: can query docs or status APIs
- `coder`: can read/write specific repos
- `reviewer`: can comment, not merge
- `deployer`: can trigger staging deploys
- `approver-required`: can execute only with human approval
This is where RBAC still shines: it's understandable, debuggable, and usually enough to get started. If your environment is more dynamic, a policy engine like OPA is a better fit.
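As a concrete sketch, roles like the ones above can be enforced with a few lines at the tool boundary. The role and scope names here are illustrative, not a fixed API:

```typescript
// Hypothetical role-to-scope matrix, checked at every tool invocation.
type Role = "reader" | "coder" | "reviewer" | "deployer";

const ROLE_SCOPES: Record<Role, string[]> = {
  reader: ["docs:read", "status:read"],
  coder: ["repo:read", "repo:write", "pr:create"],
  reviewer: ["pr:comment"], // can comment, cannot merge
  deployer: ["deploy:staging"],
};

// true if any of the agent's assigned roles grants the required scope
function isAllowed(roles: Role[], requiredScope: string): boolean {
  return roles.some((role) => ROLE_SCOPES[role].includes(requiredScope));
}
```

The point of keeping the matrix this small is that you can read the entire authorization surface in one screen before adding anything dynamic.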
## A simple pattern for signed agent actions
Here’s the rough shape of what you want:
- Agent gets a keypair
- Agent requests a short-lived token with role claims
- Agent signs a tool request
- Tool verifies both:
  - the signature
  - the role or policy claims
Example in TypeScript using Ed25519 signing via `tweetnacl`:

```typescript
import nacl from "tweetnacl";
import { encodeBase64 } from "tweetnacl-util";

// One keypair per agent. In production, provision and store the private
// key securely rather than generating it ad hoc at runtime.
const keyPair = nacl.sign.keyPair();

const requestBody = JSON.stringify({
  tool: "create_pull_request",
  repo: "acme/api",
  branch: "agent/fix-auth",
  role: "coder",
  timestamp: Date.now() // lets the verifier reject stale or replayed requests
});

// Detached signature over the exact request bytes
const signature = nacl.sign.detached(
  Buffer.from(requestBody),
  keyPair.secretKey
);

const envelope = {
  body: requestBody,
  signature: encodeBase64(signature),
  publicKey: encodeBase64(keyPair.publicKey)
};

console.log(envelope);
```
Verification on the tool side:

```typescript
import nacl from "tweetnacl";
import { decodeBase64 } from "tweetnacl-util";

// Returns true only if the body was signed by the holder of the private
// key corresponding to envelope.publicKey.
function verifyEnvelope(envelope: {
  body: string;
  signature: string;
  publicKey: string;
}): boolean {
  return nacl.sign.detached.verify(
    Buffer.from(envelope.body),
    decodeBase64(envelope.signature),
    decodeBase64(envelope.publicKey)
  );
}
```
This only proves the message was signed by the holder of the private key. In production, you still need to bind that key to:
- a registered agent identity
- a role set
- a trust chain
- an expiration time
- optionally, a delegation chain
That’s where identity infrastructure matters.
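A minimal sketch of that binding step, assuming a simple in-memory registry that maps agent IDs to their registered public keys and roles. All names here (`AgentRecord`, `REGISTRY`, `checkEnvelope`) are hypothetical:

```typescript
// Illustrative registry: agentId -> key and roles registered out of band.
interface AgentRecord {
  publicKey: string; // base64-encoded public key
  roles: string[];
}

const REGISTRY = new Map<string, AgentRecord>();

function checkEnvelope(
  agentId: string,
  envelope: { body: string; publicKey: string },
  maxAgeMs = 60_000,
): boolean {
  const record = REGISTRY.get(agentId);
  // Reject unknown agents and keys that don't match the registry:
  // trusting the key carried inside the envelope alone would let
  // anyone self-sign as any agent.
  if (!record || record.publicKey !== envelope.publicKey) return false;
  // Bound the request's validity window to limit replay.
  const { timestamp } = JSON.parse(envelope.body);
  return Date.now() - timestamp <= maxAgeMs;
}
```

Signature verification (as in `verifyEnvelope` above) still runs first; this check adds the "is this key actually that agent's key, and is the request fresh" layer on top.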
## Delegation is where things get dangerous
Many agent systems are multi-step by design:
- planner agent receives a goal
- planner delegates coding to a worker
- worker delegates testing to another worker
- final deployment requires approval
If every delegated agent gets the parent’s full permissions, you’ve built a privilege escalation machine.
Instead, use bounded delegation:
- short-lived delegated tokens
- narrowed scopes
- explicit audience restrictions
- traceable chains of who delegated to whom
Standards like RFC 8693 token exchange are useful here. The important idea is simple: a delegated worker should receive less access than its parent, not more.
For example:

- Planner can access `repo:read`, `repo:write`, `deploy:staging`
- Test worker gets only `repo:read`, `ci:run`
- Docs worker gets only `docs:write`
- No worker gets `deploy:prod` directly
That one design choice dramatically reduces the blast radius of a hijacked sub-agent.
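A sketch of how a delegated credential might be minted under that rule: the child's scopes are the intersection of what it requests and what the parent holds, bounded by a short TTL. This is an illustrative shape, not an RFC 8693 implementation, and all names (`DelegatedToken`, `mintDelegated`) are made up:

```typescript
// Illustrative delegated credential with traceability and expiry.
interface DelegatedToken {
  subject: string;   // the child agent receiving the token
  actor: string;     // who delegated, for the audit chain
  scopes: string[];
  expiresAt: number; // epoch ms
}

function mintDelegated(
  parent: { id: string; scopes: string[] },
  child: string,
  requested: string[],
  ttlMs = 5 * 60_000,
): DelegatedToken {
  // The child can only receive scopes the parent already holds -
  // requesting deploy:prod without having it silently yields nothing.
  const scopes = requested.filter((s) => parent.scopes.includes(s));
  return { subject: child, actor: parent.id, scopes, expiresAt: Date.now() + ttlMs };
}

const isExpired = (t: DelegatedToken) => Date.now() > t.expiresAt;
```

The intersection is the whole trick: there is no code path by which a child ends up with more access than its parent.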
## MCP servers need zero-trust thinking
MCP is making tool use easier, but it also creates a bigger attack surface. If an MCP server assumes any connected client is trusted, it becomes a soft target.
A safer MCP model includes:
- authenticating the calling agent
- verifying cryptographic identity
- checking role/policy before every tool invocation
- logging decisions with actor identity
- requiring approvals for high-risk tools
If you’re exposing an MCP server internally or publicly, treat it like any other production API: authenticate every request, authorize every action, and assume the network is hostile.
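Those checks can be sketched as a single gate in front of every tool handler. Here `verifySignature`, `isAllowed`, and `hasApproval` are stand-ins for your real identity, policy, and approval layers, not a real MCP API:

```typescript
// Illustrative zero-trust gate run before every tool invocation.
type ToolRequest = { agentId: string; tool: string; signature: string };

function gateToolCall(
  req: ToolRequest,
  verifySignature: (req: ToolRequest) => boolean,
  isAllowed: (agentId: string, tool: string) => boolean,
  highRisk: Set<string>,
  hasApproval: (req: ToolRequest) => boolean,
): { ok: boolean; reason?: string } {
  // 1. Authenticate: who is calling, cryptographically.
  if (!verifySignature(req)) return { ok: false, reason: "bad signature" };
  // 2. Authorize: is this agent allowed to use this tool.
  if (!isAllowed(req.agentId, req.tool)) return { ok: false, reason: "not authorized" };
  // 3. High-risk tools additionally require a human approval.
  if (highRisk.has(req.tool) && !hasApproval(req)) {
    return { ok: false, reason: "approval required" };
  }
  // 4. Log the decision with the actor's identity attached.
  console.log(`allow ${req.agentId} -> ${req.tool}`);
  return { ok: true };
}
```

The ordering matters: authentication before authorization, and approval checks only after both, so a denied request never reaches the approval queue.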
## Getting started: a practical rollout plan
You do not need to rebuild your whole stack this week. Start with the highest-risk path.
### 1. Inventory agent actions
List what your agents can actually do:
- read code
- write code
- open PRs
- access tickets
- query internal docs
- run CI
- deploy
- touch customer data
This gives you the first draft of roles and scopes.
### 2. Split identities
Stop sharing one credential across multiple agents or workflows.
Each agent, worker, or execution context should have its own identity. Even if you start with a simple key registry, that’s better than one giant service account.
### 3. Add least-privilege roles
Define a small RBAC matrix before you add complexity:

```yaml
roles:
  reader:
    - docs:read
    - repo:read
  coder:
    - repo:read
    - repo:write
    - pr:create
  tester:
    - repo:read
    - ci:run
  deployer:
    - deploy:staging
```
If your rules depend heavily on environment, repo, branch, or data sensitivity, move to OPA or another policy engine.
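One way to picture that transition, sketched in TypeScript rather than Rego: policies become predicates over request context instead of static role lists. The context fields and rules here are illustrative:

```typescript
// Illustrative context-aware policies: each is a predicate over the request.
type Ctx = { role: string; env: string; branch: string };

const policies: Array<(ctx: Ctx) => boolean> = [
  // deployer may act, but only against staging
  (c) => c.role === "deployer" && c.env === "staging",
  // coder may act, but only on agent-owned branches
  (c) => c.role === "coder" && c.branch.startsWith("agent/"),
];

// Allow if any policy matches; deny by default.
const allow = (ctx: Ctx) => policies.some((p) => p(ctx));
```

In OPA you would express the same rules declaratively in Rego and query the engine at the tool boundary; the deny-by-default shape stays the same.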
### 4. Use short-lived delegation
When one agent spawns another, mint a short-lived delegated credential with reduced privileges. Avoid passing parent credentials downstream.
### 5. Log identity and authorization decisions
Your logs should answer:
- which agent acted
- what it tried to do
- what role or policy allowed it
- whether approval was required
- whether delegation was involved
If you can’t reconstruct that chain later, incident response will be painful.
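One possible record shape that can answer all five questions above; the field names are illustrative, not a standard schema:

```typescript
// Illustrative audit record tied to identity rather than sessions.
interface AuditRecord {
  agentId: string;           // which agent acted
  action: string;            // what it tried to do
  decision: "allow" | "deny";
  matchedRole?: string;      // what role or policy allowed it
  approvalRequired: boolean; // whether approval was required
  approvedBy?: string;       // human approver, if any
  delegationChain: string[]; // e.g. ["planner", "test-worker"]
  timestamp: string;         // ISO 8601
}

function makeRecord(partial: Omit<AuditRecord, "timestamp">): AuditRecord {
  return { ...partial, timestamp: new Date().toISOString() };
}
```

If every authorization decision emits one of these, reconstructing "who did what, under whose delegation, with whose approval" becomes a query instead of an investigation.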
## Where Authora fits
If you’re building this yourself, the core ideas still apply: Ed25519 identities, scoped delegation, RBAC or OPA-backed policy, and verified tool access.
Authora works in this area with agent identity, authorization, delegation chains, MCP authorization, and auditability, but the main point here is architectural: agents need first-class identity and access control. Whether you implement that with your own stack, OPA, or a platform, the security model matters more than the branding.
## Try it yourself
A few free tools that can help immediately:
- Want to check your MCP server? Try https://tools.authora.dev
- Run `npx @authora/agent-audit` to scan your codebase
- Add a verified badge to your agent: https://passport.authora.dev
- Check out https://github.com/authora-dev/awesome-agent-security for more resources
The biggest mindset shift is this: stop treating agents like invisible application glue, and start treating them like security principals.
Once an agent can act, it needs identity.
Once it has identity, it needs authorization.
And once it has authorization, you need a way to prove what happened.
That’s how you prevent hijacking from turning into a production incident.
-- Authora team
This post was created with AI assistance.