DEV Community

Amit Malhotra

AI Agents Security: Why Your Framework Needs an Update

AI Agents Are Infrastructure, Not Magic — And Your Security Framework Needs to Catch Up

Most organisations treat AI agents as a special category of software that somehow exists outside their normal security governance. That assumption is already causing problems in production.

I've spent the last year advising SaaS teams across Canada and the US on GCP platform architecture, and the pattern I keep seeing is concerning: agents deployed with over-privileged service accounts, no audit trail of autonomous decisions, and compliance teams discovering months later that they have no idea what the agent actually did.

An AI agent can read your data, write to your systems, call external APIs, and trigger downstream actions — autonomously, at machine speed. The governance questions that matter aren't about AI ethics. They're about infrastructure security: who authorised this action, what data did the agent access, what happens when it makes a wrong decision, and how do you detect when it's been manipulated.

The Governance Gap Most Teams Haven't Noticed

The problem isn't that teams are being careless. It's that existing security frameworks were designed for deterministic software. Traditional applications have predictable behaviour paths. You can audit the code, understand the decision logic, and scope permissions to exactly what the application needs to do.

Agents break this model. An LLM-powered agent's behaviour depends on its prompts, its context, and a model whose output varies with the input. The same agent with the same code can take completely different actions depending on what data it reads or what instructions it receives.

I've seen agents deployed to production with no IAM boundary — the same service account used for agent execution and data access, with no separation. One agent I reviewed was authorised to delete GCP resources based on LLM reasoning alone, with no human approval gate. Another was calling external APIs without egress controls, sending data to third-party LLM providers with PIPEDA implications the team hadn't considered.

The compliance team showed up three months after deployment asking for an audit trail of every decision the agent had made. It didn't exist.

Agents Are Attack Surface, Not Just Automation

Here's what most teams miss: an AI agent is a new category of attack surface, not just a new category of automation.

The prompt injection problem is well-documented for chatbots. But agent prompt injection is more dangerous because agents don't just generate text — they take actions. And the injection vector isn't always user input. I've seen agents compromised through data they read from internal systems. An agent reads a document containing malicious instructions, interprets them as legitimate commands, and executes them.

This is why the Security-by-Design principle from the SCALE framework matters more for agents than almost any other infrastructure component. If identity boundaries are wrong at deployment, no amount of monitoring will protect you later.

The blast radius of a compromised agent depends entirely on the permissions you gave it. An agent with roles/owner on a project can do anything in that project. An agent scoped to read access on a single BigQuery dataset can do almost nothing, even if fully compromised.
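One way to make blast radius visible is to scan a project's IAM policy for agent identities that hold broad basic roles. The sketch below is a minimal illustration, not a complete audit: it assumes the policy has been exported as JSON (the shape returned by `gcloud projects get-iam-policy PROJECT --format=json`), and the function name, the choice of roles to flag, and the sample policy are all my own assumptions.

```python
# Basic roles that grant broad access across an entire project.
BROAD_ROLES = {"roles/owner", "roles/editor"}


def find_broad_grants(policy, agent_members):
    """Return (member, role) pairs where an agent identity holds a broad role."""
    findings = []
    for binding in policy.get("bindings", []):
        if binding.get("role") in BROAD_ROLES:
            for member in binding.get("members", []):
                if member in agent_members:
                    findings.append((member, binding["role"]))
    return findings


# Hypothetical policy export, using the service account names from this post.
policy = {
    "bindings": [
        {
            "role": "roles/owner",
            "members": ["serviceAccount:agent-executor@project.iam.gserviceaccount.com"],
        },
        {
            "role": "roles/bigquery.dataViewer",
            "members": ["serviceAccount:data-reader@project.iam.gserviceaccount.com"],
        },
    ]
}
agents = {"serviceAccount:agent-executor@project.iam.gserviceaccount.com"}

# Every pair printed here is an agent whose compromise exposes the whole project.
print(find_broad_grants(policy, agents))
```

A check like this only covers project-level bindings. Grants inherited from folders or the organisation would need the same treatment.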

What Actually Works in Production

The governance framework for agents is the same as for any other automated system: least privilege, audit trail, blast radius control. The difference is that the failure modes are harder to predict, which makes governance more important, not less.

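IAM scoping that assumes compromise. Agent service accounts should have only the permissions needed for their specific task, scoped to specific resources. Separate the service account (SA) for agent execution from the SA for data access. Use impersonation chains, not a single over-privileged account. The command below grants the executor the Token Creator role on the data-access account, so the agent obtains short-lived credentials for data reads instead of holding those permissions directly.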

# Agent execution SA impersonates data-access SA
gcloud iam service-accounts add-iam-policy-binding \
  data-reader@project.iam.gserviceaccount.com \
  --member="serviceAccount:agent-executor@project.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountTokenCreator"

Human-in-the-loop for operations you can't reverse. Implement approval gates for high-risk actions using callback functions in your agent framework:

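# Illustrative tool names; request_human_approval is a placeholder for your own approval workflow.
HIGH_RISK_TOOLS = {"delete_resource", "deploy_to_production", "send_external_email"}

def before_tool_call(tool_name, tool_input, context):
    # Block high-risk tools unless a human explicitly approves the call.
    if tool_name in HIGH_RISK_TOOLS:
        approval = request_human_approval(tool_name, tool_input)
        if not approval:
            raise PermissionError(f"Human approval required for {tool_name}")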

This slows agents down. That's the point. Reserve approval gates for operations with irreversible consequences — delete, deploy to production, send external communication.

Structured audit logging of every agent action. Every tool call should generate a structured log entry with agent ID, input, output, and timestamp. This isn't optional when your compliance team inevitably asks what the agent was doing for the last quarter.
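As a rough sketch of what such an entry could look like, the example below writes one JSON object per tool call using only the Python standard library. The field names, the logger name, and the sample tool call are assumptions for illustration, not a schema from any particular agent framework.

```python
import json
import logging
import sys
from datetime import datetime, timezone

# Dedicated logger so audit records can be routed separately from debug output.
audit_logger = logging.getLogger("agent.audit")
audit_logger.setLevel(logging.INFO)
audit_logger.addHandler(logging.StreamHandler(sys.stdout))


def build_audit_entry(agent_id, tool_name, tool_input, tool_output):
    """Assemble one JSON-serialisable record for a single tool call."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "tool": tool_name,
        "input": tool_input,
        "output": tool_output,
    }


def log_tool_call(agent_id, tool_name, tool_input, tool_output):
    """Emit the record as a single JSON line and return it."""
    entry = build_audit_entry(agent_id, tool_name, tool_input, tool_output)
    # default=str keeps logging from failing on values json cannot encode.
    audit_logger.info(json.dumps(entry, default=str))
    return entry


# Hypothetical tool call, logged after the tool returns.
entry = log_tool_call("agent-executor", "bigquery_query", {"sql": "SELECT 1"}, {"rows": 1})
```

Writing one JSON object per line keeps the records easy to parse and query later, which is what an auditor's request will eventually require.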

VPC Service Controls as a containment boundary. A VPC-SC perimeter around the agent's GCP resource access limits data exfiltration even if the agent is compromised. For external API calls, egress firewall rules limit where the agent can send data, and Cloud NAT with fixed IPs gives those calls a stable source address that external providers can allowlist.

Model Armor as a guardrail layer. Policy-based filtering of agent inputs and outputs catches known attack patterns before they reach the agent or after the agent generates a response.

The Trade-offs Are Real

Strict IAM scoping can break agent functionality in ways that are hard to debug. Agents can fail silently when permissions are missing. You need to test permission boundaries explicitly before production deployment, not discover them through user complaints.
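A pre-deployment check can be as simple as comparing the permissions a task needs against the permissions the agent's identity actually holds. The sketch below assumes you already have both sets in hand; in practice the granted set might come from IAM's `testIamPermissions` methods or a policy export. The function name and the sample values are hypothetical.

```python
def check_permission_boundary(required, granted):
    """Compare what a task needs against what the identity actually holds."""
    required, granted = set(required), set(granted)
    return {
        # Needed but not held: the agent will fail at runtime.
        "missing": sorted(required - granted),
        # Held but not needed: unnecessary blast radius.
        "excess": sorted(granted - required),
    }


# Hypothetical values for an agent that should only run read queries.
required = {"bigquery.tables.getData", "bigquery.jobs.create"}
granted = {"bigquery.tables.getData", "bigquery.tables.delete"}

report = check_permission_boundary(required, granted)
print(report)
```

Running this in CI before each deployment surfaces both kinds of drift: the missing permission that would cause a silent failure, and the extra permission nobody meant to grant.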

Human-in-the-loop defeats the purpose for high-volume automated workflows. If your agent handles 10,000 operations per day and each one requires human approval, you don't have an agent — you have a very expensive suggestion engine. Match approval gates to actual risk, not theoretical risk.
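To illustrate matching gates to risk, here is a minimal sketch that classifies tools into two tiers and only routes the high-risk tier to a human. The tool names, the two-tier split, and the sample workload are assumptions; a real system would likely need more tiers and per-argument checks.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical tool names. Tools missing from this map are treated as HIGH,
# so a newly added tool is gated until someone classifies it.
TOOL_RISK = {
    "read_dataset": Risk.LOW,
    "summarise_ticket": Risk.LOW,
    "delete_resource": Risk.HIGH,
    "deploy_to_production": Risk.HIGH,
    "send_external_email": Risk.HIGH,
}


def needs_approval(tool_name):
    """Return True when a tool call should wait for a human decision."""
    return TOOL_RISK.get(tool_name, Risk.HIGH) is Risk.HIGH


# Hypothetical day of traffic: mostly reads, a handful of destructive calls.
calls = ["read_dataset"] * 9_990 + ["delete_resource"] * 10
gated = sum(needs_approval(name) for name in calls)
print(f"{gated} of {len(calls)} calls need a human")
```

Defaulting unknown tools to the high-risk tier is a deliberate choice here: it trades some friction for the guarantee that nothing irreversible runs unreviewed by accident.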

Full audit logging of LLM inputs and outputs is expensive. It also raises data retention questions — are you storing customer data in those logs? For how long? Under what jurisdiction? Define your retention policy before enabling verbose logging, not after your storage bill arrives.
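One way to reduce the amount of customer data that lands in those logs is to redact known sensitive fields before a payload is written. The sketch below is a simple key-based approach with a hypothetical list of field names. It will not catch personal data embedded in free text, so treat it as a starting point, not a guarantee.

```python
# Hypothetical field names; adjust to the data your tools actually handle.
SENSITIVE_KEYS = {"email", "phone", "address"}


def redact(value):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    # Scalars pass through unchanged; free-text fields are NOT inspected.
    return value


# Hypothetical tool output containing customer details.
tool_output = {
    "customer": {"email": "a@example.com", "plan": "pro"},
    "rows": [{"phone": "555-0100", "status": "open"}],
}
safe_output = redact(tool_output)
print(safe_output)
```

Applying this before the audit entry is built means the retention policy covers less sensitive material from the start.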

The Business Reality

The companies that will get this right are the ones that treat AI agents as infrastructure components, not as a special category that exists outside their security perimeter.

The audit risk is real. SOC 2 auditors are already asking about AI governance. If you can't explain what your agent is authorised to do and provide an audit trail of what it actually did, you have a finding.

The operational risk is real. An agent with roles/owner that hallucinates a cleanup operation can delete production resources before anyone notices.

The compliance risk is real. An agent sending data to a US-based LLM API without egress controls creates PIPEDA implications for Canadian companies that most teams haven't thought through.

The framework isn't complicated. Least privilege. Audit trail. Blast radius control. Human oversight for irreversible actions. The same principles you apply to any automated system.

The difference is that AI agents fail in less predictable ways. That doesn't mean governance is impossible. It means governance is mandatory.

What patterns have you seen break this approach in production?


Work with a GCP specialist — book a free discovery call

Amit Malhotra

Principal GCP Architect, Buoyant Cloud Inc


https://buoyantcloudtech.com