DEV Community

The Seventeen
The Security Checklist for Every AI Agent That Calls External APIs

Most AI agent security discussions focus on prompt injection in the abstract. This one is practical. If your agent calls external APIs, here is the specific list of things worth checking before it goes anywhere near production.


Credentials

The agent should not hold credential values.
If your agent reads os.environ.get("STRIPE_KEY") or pulls a value from a secrets manager into a variable, the credential exists in the agent's execution context: reachable by the agent, by anything the agent spawns, and by any malicious instruction the agent is fed through external content.

The right architecture keeps the credential value outside the agent entirely: the agent passes a key name, the value resolves and injects at the transport layer, and the agent receives the API response. Nothing to extract at any step.
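A minimal sketch of that separation, assuming a hypothetical in-process `TransportProxy` standing in for the real out-of-process proxy (names and structure are illustrative, not AgentSecrets' actual API):

```python
# Sketch: the agent only ever handles a key *name*; a proxy resolves the
# value and attaches it at the transport layer. In a real deployment the
# vault and proxy live outside the agent's process entirely.

def agent_call(proxy, key_name: str, url: str) -> dict:
    # The agent's request carries a reference, never the secret itself.
    return proxy.forward({"key_ref": key_name, "url": url})

class TransportProxy:
    def __init__(self, vault: dict):
        self._vault = vault  # outside the agent's reach in practice

    def forward(self, request: dict) -> dict:
        secret = self._vault[request["key_ref"]]  # resolved here, not in the agent
        headers = {"Authorization": f"Bearer {secret}"}
        # ... a real implementation would send the HTTP request with `headers`
        return {"status": 200, "url": request["url"], "sent_auth": bool(headers)}

proxy = TransportProxy({"STRIPE_KEY": "sk_live_example"})
result = agent_call(proxy, "STRIPE_KEY", "https://api.stripe.com/v1/charges")
# The agent sees only the response; the secret never enters its scope.
```

The property worth noticing: nothing the agent can read, print, or pass to a spawned process ever contains the credential value.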

Credentials should not be in files the agent can read.
.env files, config files, any plaintext file in a directory the agent has access to. If the agent can read the filesystem and the credential is on the filesystem, the credential is reachable.

Team members should not share credential values directly.
Slack messages, emails, shared .env files. Each copy is an exposure point that cannot be revoked when someone leaves. Use a tool that grants revocable, encrypted access rather than sharing raw values.


Network access

The agent should only be able to call domains it legitimately needs.
Deny-by-default domain allowlisting means the proxy blocks any outbound request to an unauthorized domain before credential resolution happens. A prompt injection attack that tries to redirect an authenticated call to an attacker-controlled server hits a wall at the proxy before a credential is ever involved.

In AgentSecrets, this is configured at the workspace level:

```shell
agentsecrets workspace allowlist add api.stripe.com api.openai.com
```

Any call to a domain not on this list is blocked before the credential is ever looked up.
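The check itself is simple. A sketch of the proxy-side logic, with a hardcoded `ALLOWLIST` standing in for the workspace configuration (the real enforcement lives in the proxy, not in agent code):

```python
from urllib.parse import urlparse

# Hypothetical mirror of the workspace allowlist configured above.
ALLOWLIST = {"api.stripe.com", "api.openai.com"}

def is_allowed(url: str) -> bool:
    # Deny-by-default: block any domain not explicitly listed,
    # before any credential lookup takes place.
    return urlparse(url).hostname in ALLOWLIST

allowed = is_allowed("https://api.stripe.com/v1/charges")
blocked = is_allowed("https://attacker.example/exfil")
```

Because the check runs on the hostname of the outbound request, a prompt-injected redirect to an unlisted domain fails closed.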

API responses should be scanned for credential echoes.
Some APIs reflect authentication headers back in their response bodies. If an attacker can get the agent to call such an endpoint, the credential value may appear in the response the agent receives. Automatic response redaction catches this before the agent sees anything.
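Redaction can be as blunt as a literal substring scrub at the proxy before the response is handed back. A sketch (function name and shape are hypothetical; a production version would also scan headers and handle encodings):

```python
def redact_echoes(body: str, secret_values: list[str]) -> str:
    # Replace any literal credential value an API reflected back,
    # before the agent ever sees the response body.
    for value in secret_values:
        body = body.replace(value, "[REDACTED]")
    return body

echoed = '{"auth": "Bearer sk_live_abc123", "ok": true}'
clean = redact_echoes(echoed, ["sk_live_abc123"])
```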


Audit trail

Every API call the agent makes should be logged.
Key name, endpoint, method, status code, timestamp, duration. Not the credential value, which should never appear in any log, but enough to reconstruct what the agent did and when.

The log should be queryable by agent identity.
In a multi-agent system, "which agent made this call" is a question you will eventually need to answer. If every log entry is anonymous, incident response becomes significantly harder.

The log should capture policy state, not just outcomes.
A log entry that records what happened is useful. A log entry that also records what the agent was permitted to do at the time it happened is forensically useful. If the allowlist changes after an incident, you still want to know what it was during the incident.
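Putting the three audit points together, one log entry might look like the following sketch (field names are illustrative, not AgentSecrets' actual schema): the key name but never the value, the agent identity, and a snapshot of the allowlist in force at call time.

```python
import json
import time

def log_entry(agent_id, key_name, endpoint, method, status, allowlist):
    # Records the outcome *and* the policy state at the moment of the call.
    # The credential value itself never appears anywhere in the entry.
    return {
        "ts": time.time(),
        "agent": agent_id,
        "key_name": key_name,
        "endpoint": endpoint,
        "method": method,
        "status": status,
        "allowlist_snapshot": sorted(allowlist),
    }

entry = log_entry("prod-billing-agent", "STRIPE_KEY",
                  "https://api.stripe.com/v1/charges", "POST", 200,
                  {"api.stripe.com", "api.openai.com"})
line = json.dumps(entry)  # one queryable JSON line per call
```

With the snapshot embedded, a later allowlist change cannot rewrite the forensic record of what the agent was permitted to do during an incident.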


Agent identity

Named agents should have verified identities.
An agent that can assert any name it wants provides weak accountability. Issued identity tokens that the proxy verifies cryptographically mean a log entry is bound to the specific registration that made the call, not to whatever the agent claims.

Tokens should be revocable per agent, per environment.
One token per deployment context. If the production token is compromised, revoke it without affecting staging. If a developer leaves, revoke their agent tokens without rebuilding the workspace.


The self-assessment

Go through this list for the agents you are running today:

  • Where does the credential value exist at the moment your agent makes an API call?
  • Can your agent read any file that contains a credential value?
  • If your agent were given a malicious instruction to exfiltrate credentials, would anything stop it?
  • Can you name every domain your agent is permitted to call?
  • If something unexpected appears in your logs, can you tell which agent did it?
  • If a team member leaves today, can you revoke their access to all credentials without sharing new values manually?

Uncomfortable answers are worth addressing before the agent handles anything sensitive.


AgentSecrets addresses all of these at the architecture level. The full security model is documented at agentsecrets.theseventeen.co. The repository is at github.com/The-17/agentsecrets.
See how it's being built at engineering.theseventeen.co.

Top comments (1)

Adarsh Kant

This is incredibly relevant for anyone building AI agents that interact with the real world. We deal with this exact challenge at AnveVoice — our voice AI doesn't just call APIs, it takes real DOM actions on websites (clicking buttons, filling forms, navigating pages).

The security surface area expands dramatically when your AI agent can manipulate a live webpage. Some additions from our experience:

  1. Action sandboxing — We scope every DOM action to a whitelist of allowed selectors per client. The AI can't interact with elements outside the defined boundary.

  2. Rate limiting at the action level — Not just API calls, but actual clicks/form fills per session. Prevents runaway automation loops.

  3. Audit trails — Every voice command → DOM action is logged with before/after DOM snapshots. Critical for debugging and compliance.

  4. Input sanitization for voice — Voice-to-text can produce unexpected strings. We sanitize everything before it touches a form field.

Great checklist — security in AI agents is still massively underrated. Bookmarking this.