Why You Should Never Let an LLM Decide Your AI Agent's Permissions

If you let a large language model (LLM) decide what your AI agent can and cannot do, you may be handing over the keys to the kingdom. An LLM can be impressively creative, but it has no built-in understanding of the safety policies your production system must enforce. In this article I walk through, from my own experience, why you should never let an LLM decide an agent's permissions, and how to implement a lightweight, auditable permission framework for your agents instead.

The Problem: LLMs Aren’t Security Gatekeepers

LLMs are trained to predict the next token, not to evaluate risk. When you ask an LLM to "figure out what a user is allowed to do", you get a plausible-sounding answer, but the model has no reliable notion of the principle of least privilege, compliance rules, or even your company's internal policy hierarchy. In a recent internal test I let Claude 3 Opus suggest permission sets for a data-extraction agent. The model happily proposed full admin access to the storage bucket, which would have opened a massive data-exfiltration surface.

Real‑world consequences

Privilege escalation – An LLM can unintentionally grant write access to a read‑only resource.
Compliance violations – GDPR‑style data‑subject requests can be ignored if the model doesn't understand legal constraints.
Unexpected costs – Granting unrestricted network access can lead to runaway usage of metered external APIs.

The takeaway? An LLM is a great collaborator, not a policy enforcer.

A Simple Permission Model You Can Deploy Today

Instead of trusting the model, I built a tiny JSON‑based policy language that lets you define what an agent may do, where, and under which conditions. The policy is evaluated before the LLM is invoked, guaranteeing that the model only operates within safe bounds.

agent-policy.json:
{
  "agent_name": "data_extractor",
  "allowed_actions": ["read", "list"],
  "resource_patterns": ["s3://my-bucket/reports/*"],
  "max_runtime_seconds": 30,
  "rate_limit": {
    "calls_per_minute": 60
  }
}

The policy is deliberately declarative: it lists actions, resource globs, and auxiliary constraints. No code is executed at this point, making it easy to review and audit.
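
Because the format is declarative, it can also be validated at load time. The following is a minimal sketch, assuming the field names from the example policy above; the `Policy` dataclass and the `parse_policy` helper are hypothetical names for illustration, not part of any library.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Immutable, typed view of the policy file (hypothetical helper)."""
    agent_name: str
    allowed_actions: tuple
    resource_patterns: tuple
    max_runtime_seconds: int
    calls_per_minute: int


def parse_policy(text: str) -> Policy:
    """Parse policy JSON and fail loudly on missing or malformed fields."""
    raw = json.loads(text)
    try:
        policy = Policy(
            agent_name=raw["agent_name"],
            allowed_actions=tuple(raw["allowed_actions"]),
            resource_patterns=tuple(raw["resource_patterns"]),
            max_runtime_seconds=int(raw["max_runtime_seconds"]),
            calls_per_minute=int(raw["rate_limit"]["calls_per_minute"]),
        )
    except KeyError as exc:
        raise ValueError(f"policy is missing required field {exc}") from exc
    if not policy.allowed_actions or not policy.resource_patterns:
        raise ValueError("policy needs at least one action and one resource pattern")
    if policy.max_runtime_seconds <= 0 or policy.calls_per_minute <= 0:
        raise ValueError("limits must be positive")
    return policy


EXAMPLE = """
{
  "agent_name": "data_extractor",
  "allowed_actions": ["read", "list"],
  "resource_patterns": ["s3://my-bucket/reports/*"],
  "max_runtime_seconds": 30,
  "rate_limit": {"calls_per_minute": 60}
}
"""

policy = parse_policy(EXAMPLE)
print(policy.allowed_actions)
```

Freezing the dataclass means nothing downstream can widen the permissions after the file has been loaded.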

Enforcing Policies with a Tiny Python Wrapper

I wrapped the policy in a Python module that checks the request against the policy before delegating to the LLM. Below is the core of the enforcement logic.

import fnmatch
import json
import time
from pathlib import Path

class PolicyError(RuntimeError):
    pass

class AgentPolicy:
    def __init__(self, policy_path: str):
        self.policy = json.loads(Path(policy_path).read_text())
        self._window_start = None  # start of the current one-minute window
        self._calls_this_minute = 0

    def _rate_limit(self):
        # Monotonic clock: unaffected by system clock adjustments
        now = time.monotonic()
        # Fixed window: reset the counter once 60 seconds have passed
        if self._window_start is None or now - self._window_start >= 60:
            self._window_start = now
            self._calls_this_minute = 0
        if self._calls_this_minute >= self.policy["rate_limit"]["calls_per_minute"]:
            raise PolicyError("Rate limit exceeded")
        self._calls_this_minute += 1

    def check(self, action: str, resource: str, runtime: int):
        # Action whitelist
        if action not in self.policy["allowed_actions"]:
            raise PolicyError(f"Action '{action}' not permitted")
        # Resource glob check (fnmatchcase is case-sensitive on every platform)
        if not any(fnmatch.fnmatchcase(resource, pat) for pat in self.policy["resource_patterns"]):
            raise PolicyError(f"Resource '{resource}' outside allowed patterns")
        # Runtime cap
        if runtime > self.policy["max_runtime_seconds"]:
            raise PolicyError("Requested runtime exceeds policy limit")
        # Rate‑limit enforcement
        self._rate_limit()
        return True
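
One caveat on the glob check: in Python's `fnmatch`, `*` also matches `/`, so a pattern like `reports/*` accepts nested keys as well as paths containing `..` segments. Whether that matters depends on how your storage backend or downstream tools resolve paths. Here is a minimal sketch of a stricter matcher; `resource_allowed` is a hypothetical helper, and rejecting `.` and `..` segments is my assumption about what "safe" means here.

```python
import fnmatch


def resource_allowed(resource: str, patterns: list[str]) -> bool:
    """Case-sensitive glob match that refuses '.' and '..' path segments."""
    if any(segment in (".", "..") for segment in resource.split("/")):
        return False
    return any(fnmatch.fnmatchcase(resource, pat) for pat in patterns)


patterns = ["s3://my-bucket/reports/*"]
print(resource_allowed("s3://my-bucket/reports/q1.csv", patterns))
print(resource_allowed("s3://my-bucket/reports/../secrets.csv", patterns))
```

The first call is accepted and the second is rejected, even though the plain glob alone would have matched both.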

Usage example

policy = AgentPolicy('agent-policy.json')

# Suppose the agent wants to read a report with a 10-second runtime budget
try:
    policy.check(action='read', resource='s3://my-bucket/reports/q1.csv', runtime=10)
    # Check passed: now invoke the LLM to extract data.
    # `llm` is a placeholder for your model client; it is not defined in this snippet.
    result = llm.run(prompt='Extract the numbers from the CSV...')
except PolicyError as e:
    print('Policy violation:', e)

If the LLM suggests a forbidden action (e.g., delete), the wrapper raises PolicyError before any external call occurs. The checks are simple in-memory comparisons, so the overhead is negligible next to an LLM round trip, yet they protect you from catastrophic mistakes.
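
The same check can sit in front of every tool call the model proposes, so nothing the LLM asks for runs unchecked. The sketch below is self-contained under stated assumptions: the policy is an inline dict rather than the `AgentPolicy` class, and `guarded_call`, `fake_read`, and `fake_delete` are hypothetical names used only for illustration.

```python
import fnmatch


class PolicyError(RuntimeError):
    """Raised when a proposed action falls outside the policy."""


POLICY = {
    "allowed_actions": ["read", "list"],
    "resource_patterns": ["s3://my-bucket/reports/*"],
}


def guarded_call(action, resource, tool):
    """Run tool(resource) only if the policy allows the action on the resource."""
    if action not in POLICY["allowed_actions"]:
        raise PolicyError(f"Action '{action}' not permitted")
    if not any(fnmatch.fnmatchcase(resource, p) for p in POLICY["resource_patterns"]):
        raise PolicyError(f"Resource '{resource}' outside allowed patterns")
    return tool(resource)


calls = []  # records which tools actually ran


def fake_read(resource):
    calls.append(("read", resource))
    return "contents"


def fake_delete(resource):
    calls.append(("delete", resource))


guarded_call("read", "s3://my-bucket/reports/q1.csv", fake_read)
try:
    # The model "asks" for a delete; the tool must never execute.
    guarded_call("delete", "s3://my-bucket/reports/q1.csv", fake_delete)
except PolicyError as err:
    print("Policy violation:", err)

print(calls)
```

Only the read is recorded in `calls`; the delete is rejected before `fake_delete` is ever invoked.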

Automating Audits & Continuous Improvement

Because the policy file is plain JSON, you can version‑control it alongside your code. I set up a CI job that runs a static‑analysis test on every PR:

Parse the policy with a schema validator.
Ensure no * wildcards appear in resource_patterns for production agents.
Verify that max_runtime_seconds never exceeds 60 for agents accessing external APIs.
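
These checks fit in a short script. The sketch below assumes each policy carries two hypothetical fields that the example policy above does not define, `environment` and `uses_external_apis`, so the linter knows which rules apply. The structural check is a stand-in for a real schema validator, and `lint_policy` is an illustrative name.

```python
REQUIRED_FIELDS = {
    "agent_name": str,
    "allowed_actions": list,
    "resource_patterns": list,
    "max_runtime_seconds": int,
    "rate_limit": dict,
}


def lint_policy(policy: dict) -> list[str]:
    """Return human-readable problems; an empty list means the policy passes."""
    problems = []
    # 1. Minimal structural check (stand-in for a schema validator).
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(policy.get(field), expected):
            problems.append(f"{field}: missing or not of type {expected.__name__}")
    # 2. No wildcards for production agents (hypothetical 'environment' field).
    patterns = policy.get("resource_patterns")
    if policy.get("environment") == "production" and isinstance(patterns, list):
        for pattern in patterns:
            if "*" in str(pattern):
                problems.append(f"wildcard in production pattern: {pattern}")
    # 3. Runtime cap for agents calling external APIs (hypothetical flag).
    runtime = policy.get("max_runtime_seconds")
    if policy.get("uses_external_apis") and isinstance(runtime, int) and runtime > 60:
        problems.append("max_runtime_seconds exceeds 60 for an external-API agent")
    return problems


risky = {
    "agent_name": "data_extractor",
    "allowed_actions": ["read", "list"],
    "resource_patterns": ["s3://my-bucket/reports/*"],
    "max_runtime_seconds": 90,
    "rate_limit": {"calls_per_minute": 60},
    "environment": "production",
    "uses_external_apis": True,
}

for problem in lint_policy(risky):
    print("FAIL:", problem)
```

In a CI job, exiting with a non-zero status whenever the returned list is non-empty is enough to block the PR.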

The audit logs from the wrapper (written to stderr) are shipped to a monitoring dashboard, giving you a live view of policy violations. Over time, you can tighten the policy as you learn about real‑world usage patterns.
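
The enforcement code shown earlier does not itself emit logs, so here is one way the audit trail could be produced. This sketch uses the standard `logging` module, whose `StreamHandler` writes to stderr by default; the logger name `agent.policy`, the JSON field names, and the `audit` helper are my assumptions, not a fixed format.

```python
import json
import logging
import time

audit_log = logging.getLogger("agent.policy")
handler = logging.StreamHandler()  # writes to sys.stderr by default
handler.setFormatter(logging.Formatter("%(message)s"))
audit_log.addHandler(handler)
audit_log.setLevel(logging.INFO)


def audit(agent: str, action: str, resource: str, allowed: bool, reason: str = "") -> str:
    """Emit one JSON line per policy decision and return that line."""
    record = json.dumps({
        "ts": time.time(),
        "agent": agent,
        "action": action,
        "resource": resource,
        "allowed": allowed,
        "reason": reason,
    })
    # Denials go out at WARNING so a dashboard can alert on them.
    audit_log.log(logging.INFO if allowed else logging.WARNING, record)
    return record


line = audit(
    "data_extractor",
    "delete",
    "s3://my-bucket/reports/q1.csv",
    allowed=False,
    reason="Action 'delete' not permitted",
)
```

One JSON object per line keeps the output easy to ship to most log collectors without custom parsing.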

What I Learned

Never delegate authority to an LLM. Even a well‑trained model can hallucinate permissive settings.
A tiny declarative policy layer adds a security “guardrail” with virtually no runtime cost.
First-hand experience matters. My own misstep, letting an LLM propose admin bucket access, highlighted the need for a systematic approach.
Version‑control your policies just like code. Audits become trivial, and you can roll back a risky change instantly.

By keeping the LLM inside a sandbox of explicit permissions, you reap the creative benefits of AI while keeping your system compliant, cost‑effective, and safe.

If you found this guide useful, feel free to share your own permission-policy experiences in the comments. Let's build AI agents that are both smart and secure.