DEV Community

Beatriz Albernaz

Your AI agents are probably over-privileged and under-monitored

You've got an AI agent running in production. It has an API key for your database, OAuth tokens for your cloud provider, and access to your customers' data across multiple tenants.

When did you last rotate those credentials?

If the answer is "when we set it up," you're not alone, but you've got a real exposure on your hands.

Why human IAM patterns break for AI agents

Most teams I see apply the same identity model to AI agents that they use for human users. It doesn't hold up.

Human users log in, do a thing, log out. Sessions are bounded. Permissions map to job roles. Rotation happens on a schedule everyone tolerates.

AI agents are different in every one of those dimensions:

  • They run continuously, not in discrete sessions
  • They need permissions that change based on task context, not a static role
  • They generate and consume credentials programmatically, without human review
  • They operate across multiple customer tenants in the same workflow
  • A compromised agent credential doesn't just expose data; it lets an attacker take actions on that data

The blast radius isn't the same. A leaked human user token is bad. A leaked agent credential that has read-write access to your entire database and can execute tool calls across your environment is worse.

What we actually find in AI red teaming engagements

40–50% of high-severity findings in our AI red teaming assessments are identity-related. These are the patterns that come up most:

Static, long-lived credentials
Agents deployed with API keys that haven't been rotated since initial setup. Six months is common. The reasoning is usually "rotation would break things," which is true, and also a sign that credential management wasn't designed for automation.

Over-permissioned tool access
An agent needs to query one table. It has credentials for the entire database. This happens because scoping credentials is tedious to do properly, and "it works" is good enough at ship time.

Cross-tenant context leakage
Multi-tenant SaaS products where agent context from one customer bleeds into another customer's thread. Usually a missing tenant ID check somewhere in the tool call chain. Easy to miss in code review, easy to find in adversarial testing.

Secrets in places agents can read them
Environment variables, config files, conversation context. Agents are designed to access these things. An attacker's job is to make your agent hand them over through prompt injection or tool abuse.

What good agent identity looks like

None of this is exotic. It's standard security hygiene adapted for non-human principals.

Short-lived tokens, automated rotation

Credentials for AI agents should expire fast: under an hour for anything sensitive. Rotation needs to be automated because you're not doing it manually at that frequency.

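# Example: generating short-lived AWS credentials for an agent task.
# The account ID and role name are placeholders. 900 seconds (15 minutes)
# is the shortest duration STS accepts, well under the one-hour ceiling.
aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/agent-task-role \
  --role-session-name "agent-session-$(date +%s)" \
  --duration-seconds 900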
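
On the agent side, short-lived credentials only work if something renews them before they expire. Here is a minimal sketch of that idea in Python. The names `Credential`, `RefreshingCredential`, and the `fetch` callable are hypothetical, not part of any SDK; `fetch` stands in for whatever issues your tokens (an STS call, a vault lookup, and so on).

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Credential:
    token: str
    expires_at: float  # Unix timestamp in seconds


class RefreshingCredential:
    """Caches a short-lived credential and renews it shortly before expiry."""

    def __init__(
        self,
        fetch: Callable[[], Credential],  # hypothetical: wraps your token issuer
        refresh_margin: float = 60.0,     # renew this many seconds before expiry
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._margin = refresh_margin
        self._clock = clock
        self._current: Optional[Credential] = None

    def get(self) -> Credential:
        """Return a credential that is valid for at least `refresh_margin` seconds."""
        now = self._clock()
        if self._current is None or now >= self._current.expires_at - self._margin:
            self._current = self._fetch()
        return self._current
```

The agent calls `get()` before every request instead of holding a token in a variable, so rotation becomes a side effect of normal operation rather than a scheduled chore.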

Least privilege, scoped per task

RBAC is too coarse for agents. You want attribute-based or policy-based access control that evaluates context at request time, not a role that's "good enough for most things the agent does."

Define what each agent needs for each task. Grant that. Nothing else.
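
To make "evaluates context at request time" concrete, here is a small default-deny sketch in Python. It is an illustration under assumptions, not a real policy engine: `TaskGrant`, `Request`, and `is_allowed` are hypothetical names, and the task, tenant, and resource strings are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskGrant:
    """What one task is allowed to do. Anything not listed is denied."""
    task: str
    tenant_id: str
    actions: frozenset    # e.g. frozenset({"read"})
    resources: frozenset  # e.g. frozenset({"db:orders"})


@dataclass(frozen=True)
class Request:
    task: str
    tenant_id: str
    action: str
    resource: str


def is_allowed(grant: TaskGrant, request: Request) -> bool:
    """Default-deny check evaluated on every request, not once per role."""
    return (
        request.task == grant.task
        and request.tenant_id == grant.tenant_id
        and request.action in grant.actions
        and request.resource in grant.resources
    )


grant = TaskGrant(
    task="summarize-orders",
    tenant_id="tenant-a",
    actions=frozenset({"read"}),
    resources=frozenset({"db:orders"}),
)

print(is_allowed(grant, Request("summarize-orders", "tenant-a", "read", "db:orders")))   # True
print(is_allowed(grant, Request("summarize-orders", "tenant-a", "write", "db:orders")))  # False
print(is_allowed(grant, Request("summarize-orders", "tenant-b", "read", "db:orders")))   # False
```

The grant is tied to a task and a tenant rather than to the agent itself, so the same agent gets a different, narrower grant for each job it runs.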

Inventory your agents

You probably don't have a complete list of which agents are running, what credentials they use, and what they can access. Start there. You can't secure what you haven't inventoried.

A basic agent registry entry per deployment:

  • Agent name and version
  • Credentials in use (reference, not value)
  • Scope of access
  • Customer tenants it operates in
  • Last rotation date
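
The registry fields above map naturally onto a small record type. This is a sketch, assuming a 30-day rotation policy and invented agent names purely for illustration; `AgentRegistryEntry` and `overdue_for_rotation` are hypothetical helpers.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class AgentRegistryEntry:
    name: str
    version: str
    credential_ref: str  # a reference to the secret, never the value itself
    access_scope: tuple  # e.g. ("db:orders:read",)
    tenants: tuple       # tenant IDs the agent operates in
    last_rotated: date


def overdue_for_rotation(entries, max_age: timedelta, today: date):
    """Return the entries whose credential is older than max_age."""
    return [e for e in entries if today - e.last_rotated > max_age]


registry = [
    AgentRegistryEntry("support-agent", "1.4.0", "vault:agents/support",
                       ("db:tickets:read",), ("tenant-a", "tenant-b"), date(2024, 12, 1)),
    AgentRegistryEntry("billing-agent", "2.0.1", "vault:agents/billing",
                       ("db:invoices:read",), ("tenant-a",), date(2025, 5, 25)),
]

stale = overdue_for_rotation(registry, max_age=timedelta(days=30), today=date(2025, 6, 1))
print([e.name for e in stale])  # ['support-agent']
```

Even a list this simple answers the questions that usually have no owner: which agents exist, what they can reach, and which credentials have gone stale.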

Log actions, not just outputs

Your observability stack is probably capturing what the agent says. You also need to capture what it does: which tools it called, which files it read, which APIs it hit, which tenant's data it touched.

This is how you distinguish legitimate agent behavior from a compromised agent doing the same things.
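
A sketch of what one action-level log record could look like, using only the Python standard library. The field names and the `audit_record` helper are assumptions for illustration; adapt them to whatever schema your logging pipeline already uses.

```python
import json
from datetime import datetime, timezone


def audit_record(agent: str, tenant_id: str, tool: str, target: str, allowed: bool) -> str:
    """Serialize one agent action as a JSON line for the audit log."""
    return json.dumps(
        {
            "ts": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "tenant_id": tenant_id,  # which tenant's data was touched
            "tool": tool,            # which tool the agent called
            "target": target,        # the file, table, or API it acted on
            "allowed": allowed,      # the authorization decision at call time
        },
        sort_keys=True,
    )


print(audit_record("support-agent", "tenant-a", "sql.query", "db:tickets", True))
```

Recording the tenant and the authorization decision on every line is what lets you later ask "did this agent ever touch tenant B?" without reconstructing it from model output.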

Test tenant isolation explicitly

If you're multi-tenant, add automated tests that verify agents cannot access resources outside their assigned tenant scope. Run these in CI. Don't rely on code review alone: the failure modes are subtle, and adversarial testing finds them reliably.
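
Here is a minimal sketch of such a test. `DocumentStore` is a toy in-memory stand-in for a real multi-tenant data layer, and every name in it is hypothetical; the point is the shape of the assertion, which a test runner such as pytest would pick up from the `test_` prefix.

```python
class TenantScopeError(PermissionError):
    """Raised when a lookup crosses a tenant boundary."""


class DocumentStore:
    """Toy in-memory store standing in for a real multi-tenant data layer."""

    def __init__(self):
        self._docs = {}  # doc_id -> (tenant_id, body)

    def put(self, tenant_id, doc_id, body):
        self._docs[doc_id] = (tenant_id, body)

    def get_for_tenant(self, tenant_id, doc_id):
        owner, body = self._docs[doc_id]
        if owner != tenant_id:  # the check that tends to go missing
            raise TenantScopeError(f"{doc_id} is outside tenant {tenant_id}")
        return body


def test_agent_cannot_read_other_tenant():
    store = DocumentStore()
    store.put("tenant-a", "doc-1", "a's data")
    store.put("tenant-b", "doc-2", "b's data")

    # Same-tenant access works.
    assert store.get_for_tenant("tenant-a", "doc-1") == "a's data"

    # Cross-tenant access must fail loudly, not return data.
    try:
        store.get_for_tenant("tenant-a", "doc-2")
    except TenantScopeError:
        pass
    else:
        raise AssertionError("cross-tenant read was not blocked")
```

In a real suite you would run the same assertion against every tool the agent can call, since one unguarded tool is enough to leak.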

Before your next SOC 2 audit or enterprise deal

Enterprise customers are going to ask about this. Security questionnaires increasingly include questions about AI agent access controls, credential lifecycle, and tenant isolation.

The teams that are ahead of this have:

  1. A separate budget line for agent identity security (not absorbed into AI innovation spend)
  2. Automated credential rotation tied to their deployment pipeline
  3. Fine-grained policies scoped per agent task, not per agent role
  4. Audit logs capturing agent actions with full tenant context
  5. External validation through red teaming before major audits or sales cycles

If you haven't done this work yet, the best time to start is before the questionnaire lands, not after.


We run AI red teaming engagements for B2B SaaS companies building AI-powered features: prompt injection, tool abuse, cross-tenant isolation, credential management. If you want to know what's actually exposed before an attacker does, scope an engagement.

Top comments (2)

Harjot Singh

Over-privileged and under-monitored is the precise diagnosis, and the two halves reinforce each other into the worst case. Over-privileged because it's easier to hand the agent broad credentials than to scope them, so it can touch far more than its task needs. Under-monitored because agent actions look like normal API traffic, so nobody's watching what it actually did. Combine them and you've got a component that can do a lot and is observed by no one - which is exactly the setup for a quiet disaster, whether from a prompt injection, a hallucinated action, or just a confidently wrong decision nobody caught. We learned least-privilege and audit logging for human/service accounts decades ago; agents somehow got grandfathered out of both.

This is core to how I build - least privilege and full auditability aren't optional for autonomous components. It's baked into Moonshift, the thing I work on: a multi-agent pipeline that takes a prompt to a deployed SaaS, where each agent gets narrow scoped capabilities and every action is gated by a verify layer and observable, so an agent can't quietly do something it shouldn't. Same fix you're prescribing. Multi-model routing keeps a build ~$3 flat, first run free no card. Important post - this is the gap that'll cause the first real agent incident. What do you push first: scoping the permissions down, or getting monitoring/audit in place? I'd argue scope first, because you can't monitor your way out of god-mode.

Beatriz Albernaz

Scope first, agreed, and the red teaming data backs it up. The monitoring gap is real, but you can log everything and still miss the attack if the agent had permission to do it in the first place. The log just tells you it happened, not that it was wrong.
What you're building with Moonshift (narrowly scoped capabilities plus a verifying gate) is how you contain the blast radius before you even need the audit trail. Curious how you handle the verify layer for tool calls that are ambiguous in intent but technically within scope.