DEV Community

HK Lee

Posted on • Originally published at pockit.tools

AI Agent Authentication & Authorization: How to Secure Tool Calls, OAuth Scopes, and Permissions in Production

Your AI agent just sent a Slack message to 14,000 customers. A production support agent, designed to look up order statuses, was prompt-injected into accessing the bulk messaging API. It had the credentials. It had the permissions. Nobody approved it. The agent acted within its technical authorization — it just wasn't supposed to do that.

This is the new frontier of AI security. We've spent years hardening web applications against SQL injection, XSS, and CSRF. Now we're deploying autonomous systems that hold API keys, OAuth tokens, and database credentials — systems that make their own decisions about which tools to call and what data to access. The attack surface isn't the model's weights. It's the execution layer: the credentials, permissions, and tool access granted to agents that can be manipulated through prompt injection, goal hijacking, or simple misconfiguration.

This guide covers the complete security architecture for production AI agents: identity management, OAuth 2.1 delegated authorization, scoped tool permissions, MCP gateway enforcement, human-in-the-loop patterns, and the defense-in-depth strategies that separate a secure agent deployment from an incident waiting to happen.

Why Traditional Auth Doesn't Work for AI Agents

You wouldn't give an intern the root AWS credentials and say "use your best judgment." Yet that's essentially what many teams do with AI agents. Here's why the traditional service account model breaks down:

Agents Are Non-Deterministic Actors

A traditional microservice makes the same API calls every time. You can audit its behavior by reading its source code. An AI agent is fundamentally different:

Traditional Service:
  Input: "Get order #12345"
  → Always calls: GET /api/orders/12345
  → Predictable, auditable

AI Agent:
  Input: "Help this customer with their order"
  → Might call: GET /api/orders/12345
  → Might call: POST /api/refunds
  → Might call: PUT /api/customer/email
  → Might call: DELETE /api/orders/12345
  → Non-deterministic, context-dependent

The agent decides which tools to invoke based on its reasoning at runtime. Static RBAC policies that worked for microservices can't handle this — you need dynamic, context-aware authorization.
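To make the contrast concrete, here is a minimal sketch of a static RBAC check next to a context-aware one. All names are illustrative, not a specific library's API: the point is that the same (role, tool) pair can resolve differently depending on runtime signals the agent itself cannot control.

```typescript
// Illustrative sketch: static RBAC vs. context-aware authorization.
// All names here are hypothetical, not from a specific library.
type ToolContext = {
  userTier: 'free' | 'enterprise';
  hourOfDay: number;       // 0-23, server local time
  callsThisSession: number;
};

// Static RBAC: role maps to a fixed tool list. Same answer every time.
const ROLE_TOOLS: Record<string, string[]> = {
  'support-agent': ['lookup_order', 'create_support_ticket'],
};

function staticAllow(role: string, tool: string): boolean {
  return ROLE_TOOLS[role]?.includes(tool) ?? false;
}

// Context-aware: the same (role, tool) pair can be allowed or denied
// depending on runtime signals the agent itself cannot control.
function dynamicAllow(role: string, tool: string, ctx: ToolContext): boolean {
  if (!staticAllow(role, tool)) return false;  // still default-deny
  if (ctx.callsThisSession > 50) return false; // per-session budget
  if (
    tool === 'create_support_ticket' &&
    (ctx.hourOfDay < 6 || ctx.hourOfDay > 22)
  ) {
    return false; // no write operations after hours
  }
  return true;
}
```

A static policy answers "may this role ever use this tool?"; the dynamic check also answers "may it use it right now, in this session?"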

The Blast Radius Problem

When a traditional service is compromised, the damage is bounded by its fixed functionality. When an AI agent is compromised (or manipulated), the blast radius equals its entire permission set:

| Factor | Microservice | AI Agent |
| --- | --- | --- |
| Actions | Fixed, predefined | Dynamic, model-decided |
| Attack vector | Code exploits | Prompt injection, goal hijacking |
| Blast radius | Single function | All granted permissions |
| Audit trail | Deterministic logs | Requires reasoning trace |
| Access pattern | Predictable | Context-dependent |

An agent with broad permissions becomes a universal attack surface. Every tool it can access is a tool an attacker can potentially invoke through the agent.

The Credential Lifecycle Mismatch

Most service accounts use long-lived credentials — API keys that rotate quarterly, service tokens with no expiration. For a deterministic service, this is (somewhat) acceptable. For an agent that might be manipulated in real-time:

// ❌ How most teams deploy agents today
const agent = new Agent({
  openaiKey: process.env.OPENAI_API_KEY,
  stripeKey: process.env.STRIPE_SECRET_KEY,    // Full access
  dbConnection: process.env.DATABASE_URL,       // Read + Write
  slackToken: process.env.SLACK_BOT_TOKEN,      // All channels
  awsCredentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY,     // IAM admin??
    secretAccessKey: process.env.AWS_SECRET_KEY,
  },
});
// This agent has the keys to the kingdom

If the agent is prompt-injected, every one of these credentials is now in play.

The Agent Identity Model

The first step to securing agents is treating them as Non-Human Identities (NHIs) — not extensions of user accounts, not shared service accounts, but first-class identity principals.

Unique Agent Identity

Every agent instance should have its own identity, not share credentials with other agents or services:

interface AgentIdentity {
  agentId: string;            // Unique identifier
  agentType: string;          // e.g., 'customer-support', 'data-analyst'
  version: string;            // Agent version for audit
  deploymentEnv: string;      // 'production' | 'staging' | 'development'
  owner: string;              // Team responsible
  createdAt: Date;
  expiresAt: Date;            // Mandatory expiration
  maxConcurrentSessions: number;
  allowedTools: string[];     // Whitelist of permitted tools
  deniedTools: string[];      // Explicit blacklist
}

// Register agent identity at deployment
const identity = await identityProvider.register({
  agentType: 'customer-support',
  version: '2.4.1',
  owner: 'support-team',
  expiresAt: new Date(Date.now() + 24 * 60 * 60 * 1000), // 24h
  allowedTools: [
    'lookup_order',
    'check_shipping_status',
    'create_support_ticket',
  ],
  deniedTools: [
    'issue_refund',        // Requires human approval
    'delete_account',      // Never automated
    'bulk_message',        // Never automated
  ],
});

Short-Lived, Scoped Credentials

Kill static API keys. Every agent session should use credentials that are:

  1. Time-bound: Expire after minutes or hours, not months
  2. Scope-limited: Only grant access to the specific tools needed
  3. Session-tied: Bound to a specific agent session, not the agent type
class AgentCredentialManager {
  async getSessionCredentials(
    agentIdentity: AgentIdentity,
    sessionContext: SessionContext
  ): Promise<ScopedCredentials> {
    // Request short-lived token from identity provider
    const token = await this.idp.issueToken({
      subject: agentIdentity.agentId,
      audience: 'tool-gateway',
      scopes: this.resolveScopes(agentIdentity, sessionContext),
      expiresIn: '15m',           // 15-minute sessions
      sessionId: sessionContext.id,
      constraints: {
        maxToolCalls: 50,          // Hard limit per session
        allowedIPs: ['10.0.0.0/8'], // Network restrictions
        rateLimit: '100/minute',
      },
    });

    return {
      token,
      refreshToken: null,  // No refresh — get new session
      expiresAt: token.expiresAt,
    };
  }

  private resolveScopes(
    identity: AgentIdentity,
    context: SessionContext
  ): string[] {
    // Dynamic scope resolution based on context
    const baseScopes = identity.allowedTools.map(
      (t) => `tool:${t}:execute`
    );

    // Elevate or restrict based on user context
    if (context.userTier === 'enterprise') {
      baseScopes.push('tool:priority_support:execute');
    }

    // Time-based restrictions
    const hour = new Date().getHours();
    if (hour < 6 || hour > 22) {
      // After hours: drop scopes for tools that mutate state
      // (scope strings embed the tool name, e.g.
      //  'tool:create_support_ticket:execute')
      return baseScopes.filter(
        (s) => !/(create|update|delete|send|write)/.test(s)
      );
    }

    return baseScopes;
  }
}

OAuth 2.1 for Delegated Agent Authorization

When an AI agent acts on behalf of a user, it needs delegated authorization — the user explicitly grants the agent limited, time-bound access to their resources. This is exactly what OAuth 2.1 was designed for.

The OAuth Flow for Agents

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│   User   │    │  Agent   │    │   Auth   │    │ Resource │
│          │    │ Gateway  │    │  Server  │    │  Server  │
└────┬─────┘    └────┬─────┘    └────┬─────┘    └────┬─────┘
     │               │               │               │
     │ "Help me with │               │               │
     │  my order"    │               │               │
     │──────────────>│               │               │
     │               │               │               │
     │  Auth needed  │               │               │
     │<──────────────│               │               │
     │               │               │               │
     │ Login + Grant │               │               │
     │ scoped access │               │               │
     │──────────────────────────────>│               │
     │               │               │               │
     │               │  Scoped token │               │
     │               │<──────────────│               │
     │               │               │               │
     │               │  API call     │               │
     │               │  with token   │               │
     │               │──────────────────────────────>│
     │               │               │               │
     │               │  Response     │               │
     │               │<──────────────────────────────│
     │               │               │               │
     │  Agent uses   │               │               │
     │  result only  │               │               │
     │<──────────────│               │               │

Implementation: OAuth 2.1 + PKCE for Agents

import crypto from 'node:crypto';
import { AuthorizationCode } from 'simple-oauth2';

class AgentOAuthManager {
  private client: AuthorizationCode;

  constructor() {
    this.client = new AuthorizationCode({
      client: {
        id: process.env.AGENT_CLIENT_ID!,
        secret: '', // Public client — no secret
      },
      auth: {
        tokenHost: process.env.AUTH_SERVER_URL!,
        authorizePath: '/authorize',
        tokenPath: '/token',
      },
    });
  }

  async initiateUserConsent(
    userId: string,
    requiredScopes: string[]
  ): Promise<ConsentRequest> {
    // Generate PKCE challenge
    const codeVerifier = crypto.randomBytes(32)
      .toString('base64url');
    const codeChallenge = crypto
      .createHash('sha256')
      .update(codeVerifier)
      .digest('base64url');

    const authorizationUrl = this.client.authorizeURL({
      redirect_uri: process.env.AGENT_CALLBACK_URL,
      scope: requiredScopes.join(' '),
      state: crypto.randomUUID(),
      code_challenge: codeChallenge,
      code_challenge_method: 'S256',
    });

    // Store verifier for token exchange
    await this.storeSession(userId, { codeVerifier });

    return {
      consentUrl: authorizationUrl,
      scopes: requiredScopes,
      expiresIn: 300, // 5 minutes to complete
    };
  }

  async exchangeCode(
    userId: string,
    authorizationCode: string
  ): Promise<AgentToken> {
    const session = await this.getSession(userId);

    const tokenResponse = await this.client.getToken({
      code: authorizationCode,
      redirect_uri: process.env.AGENT_CALLBACK_URL,
      code_verifier: session.codeVerifier,
    });

    return {
      accessToken: tokenResponse.token.access_token,
      expiresAt: new Date(tokenResponse.token.expires_at),
      scopes: tokenResponse.token.scope.split(' '),
      // No refresh token — agent must re-request consent
    };
  }
}

Granular Scope Design

Design scopes that are narrow enough to enforce least privilege:

// ❌ BAD: Overly broad scopes
const scopes = ['orders:full', 'customers:full', 'payments:full'];

// ✅ GOOD: Granular, action-specific scopes
const scopes = [
  'orders:read',              // Can look up orders
  'orders:status:read',       // Can check shipping status
  'tickets:create',           // Can create support tickets
  // NOT included:
  // 'orders:write'           // Cannot modify orders
  // 'refunds:create'         // Cannot issue refunds
  // 'customers:delete'       // Cannot delete accounts
];

// Even better: resource-specific scopes
const scopes = [
  'orders:read:user:usr_abc123',     // Only this user's orders
  'tickets:create:org:org_xyz789',    // Only this org's tickets
];
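Resource-qualified scopes need a checker that understands the qualifier. Here is one way to evaluate them, sketched with an illustrative `scopeGrants` helper that assumes the qualifier format shown above (`<scope>:<resourceType>:<resourceId>`):

```typescript
// Sketch of a checker for the scope format above. A granted scope may carry
// a resource qualifier (e.g. 'orders:read:user:usr_abc123') that narrows it
// to a single resource. Helper name and shapes are illustrative.
interface ResourceRef {
  type: string; // e.g. 'user' or 'org'
  id: string;   // e.g. 'usr_abc123'
}

function scopeGrants(
  granted: string[],
  required: string,       // e.g. 'orders:read'
  resource?: ResourceRef
): boolean {
  return granted.some((scope) => {
    if (scope === required) return true; // unqualified scope: any resource
    if (!scope.startsWith(required + ':')) return false;
    // Remainder is a qualifier like 'user:usr_abc123'
    const [qualType, qualId] = scope.slice(required.length + 1).split(':');
    return (
      resource !== undefined &&
      resource.type === qualType &&
      resource.id === qualId
    );
  });
}
```

The important property is the asymmetry: an unqualified grant satisfies any resource, but a qualified grant satisfies only its named resource and nothing else.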

The Tool Gateway: Intercepting Every Agent Action

The most critical security layer sits between the agent and every tool it can invoke. Think of it as a firewall for agent actions.

Architecture

┌─────────────┐    ┌─────────────────────────────────────┐
│             │    │          Tool Gateway                │
│   Agent     │    │                                     │
│  Runtime    │───>│  ┌──────────┐  ┌───────────────┐   │
│             │    │  │  AuthZ    │  │  Rate Limiter │   │
│             │    │  │  Engine   │  │               │   │
└─────────────┘    │  └────┬─────┘  └───────┬───────┘   │
                   │       │                │            │
                   │  ┌────▼────────────────▼───────┐   │
                   │  │     Policy Enforcement       │   │
                   │  │     Point (PEP)              │   │
                   │  └────┬─────────────────────────┘   │
                   │       │                             │
                   │  ┌────▼─────────────────────────┐   │
                   │  │     Audit Logger              │   │
                   │  └────┬─────────────────────────┘   │
                   └───────┼─────────────────────────────┘
                           │
              ┌────────────┼────────────────┐
              │            │                │
        ┌─────▼────┐ ┌────▼─────┐  ┌──────▼──────┐
        │ Stripe   │ │ Database │  │ Slack API   │
        │ API      │ │          │  │             │
        └──────────┘ └──────────┘  └─────────────┘

Implementation

interface ToolCallRequest {
  agentId: string;
  sessionId: string;
  toolName: string;
  parameters: Record<string, unknown>;
  reasoning: string; // Why the agent wants to call this tool
  traceId: string;
}

interface PolicyDecision {
  allowed: boolean;
  reason: string;
  requiresApproval: boolean;
  modifiedParams?: Record<string, unknown>;
}

class ToolGateway {
  private authzEngine: AuthorizationEngine;
  private rateLimiter: RateLimiter;
  private auditLog: AuditLogger;
  private approvalQueue: ApprovalQueue;

  async executeToolCall(
    request: ToolCallRequest
  ): Promise<ToolCallResult> {
    // Step 1: Verify agent identity and session
    const session = await this.verifySession(request);
    if (!session.valid) {
      throw new UnauthorizedError('Invalid or expired session');
    }

    // Step 2: Check rate limits
    const rateLimitOk = await this.rateLimiter.check(
      request.agentId,
      request.toolName
    );
    if (!rateLimitOk) {
      await this.auditLog.log({
        event: 'RATE_LIMIT_EXCEEDED',
        ...request,
      });
      throw new RateLimitError('Tool call rate limit exceeded');
    }

    // Step 3: Evaluate authorization policy
    const decision = await this.authzEngine.evaluate({
      subject: request.agentId,
      action: request.toolName,
      resource: request.parameters,
      context: {
        sessionId: request.sessionId,
        time: new Date(),
        reasoning: request.reasoning,
      },
    });

    // Step 4: Handle policy decision
    if (!decision.allowed) {
      await this.auditLog.log({
        event: 'TOOL_CALL_DENIED',
        reason: decision.reason,
        ...request,
      });
      throw new ForbiddenError(decision.reason);
    }

    // Step 5: Human approval if required
    if (decision.requiresApproval) {
      const approved = await this.requestHumanApproval(request);
      if (!approved) {
        throw new ForbiddenError('Human approval denied');
      }
    }

    // Step 6: Execute with parameter sanitization
    const sanitizedParams = decision.modifiedParams
      || this.sanitizeParams(request.parameters);

    // Step 7: Execute and audit
    const result = await this.executeWithAudit(
      request.toolName,
      sanitizedParams,
      request
    );

    return result;
  }

  private sanitizeParams(
    params: Record<string, unknown>
  ): Record<string, unknown> {
    const sanitized = { ...params };

    // Strip potential injection patterns
    for (const [key, value] of Object.entries(sanitized)) {
      if (typeof value === 'string') {
        // Remove command-like directives
        sanitized[key] = value
          .replace(/ignore previous instructions/gi, '')
          .replace(/system:/gi, '')
          .replace(/\bsudo\b/gi, '')
          .replace(/;\s*(rm|drop|delete|truncate)\b/gi, '');
      }
    }

    return sanitized;
  }
}

MCP as the Standard Gateway

The Model Context Protocol (MCP) has become the de facto standard for agent-to-tool communication. Use MCP servers as your security boundary:

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

const server = new McpServer({
  name: 'secure-tools',
  version: '1.0.0',
});

// Register tools with built-in security and Zod validation
server.registerTool(
  'lookup_order',
  {
    description: 'Look up order details by order ID',
    inputSchema: {
      // The MCP TypeScript SDK expects a Zod raw shape here,
      // not a wrapped z.object()
      orderId: z.string()
        .regex(/^ORD-[A-Z0-9]{8}$/)  // Strict format validation
        .describe('The order ID to look up'),
    },
  },
  async ({ orderId }, { meta }) => {
    // Verify caller has required scope
    const hasScope = await verifyScope(
      meta.authToken,
      'orders:read'
    );
    if (!hasScope) {
      return {
        content: [
          {
            type: 'text',
            text: 'Error: Insufficient permissions for order lookup',
          },
        ],
        isError: true,
      };
    }

    // Verify the order belongs to the requesting user
    const order = await db.orders.findByIdAndUser(
      orderId,
      meta.userId
    );
    if (!order) {
      return {
        content: [
          {
            type: 'text',
            text: 'Order not found or access denied',
          },
        ],
        isError: true,
      };
    }

    // Return sanitized data — strip internal fields
    return {
      content: [
        {
          type: 'text',
          text: JSON.stringify(sanitizeOrder(order)),
        },
      ],
    };
  }
);

Human-in-the-Loop (HITL) Approval Flows

Not every action should be gated by human approval — that defeats the purpose of automation. But high-stakes operations demand it.

The Risk Classification Matrix

enum RiskLevel {
  LOW = 'low',       // Auto-approve
  MEDIUM = 'medium', // Log and proceed, review async
  HIGH = 'high',     // Require sync human approval
  CRITICAL = 'critical', // Block entirely
}

const TOOL_RISK_CLASSIFICATION: Record<string, RiskLevel> = {
  // Low risk — auto-approve
  'lookup_order': RiskLevel.LOW,
  'check_shipping': RiskLevel.LOW,
  'search_faq': RiskLevel.LOW,

  // Medium risk — proceed but flag for review
  'create_ticket': RiskLevel.MEDIUM,
  'update_customer_preferences': RiskLevel.MEDIUM,
  'send_notification': RiskLevel.MEDIUM,

  // High risk — require real-time approval
  'issue_refund': RiskLevel.HIGH,
  'modify_subscription': RiskLevel.HIGH,
  'access_pii': RiskLevel.HIGH,
  'escalate_to_human': RiskLevel.HIGH,

  // Critical — never allow automated execution
  'delete_account': RiskLevel.CRITICAL,
  'bulk_data_export': RiskLevel.CRITICAL,
  'modify_permissions': RiskLevel.CRITICAL,
  'execute_code': RiskLevel.CRITICAL,
};

Implementing Real-Time Approval

class ApprovalQueue {
  async requestApproval(
    request: ToolCallRequest
  ): Promise<boolean> {
    // Fail closed: tools missing from the classification are treated
    // as critical, never silently allowed
    const risk =
      TOOL_RISK_CLASSIFICATION[request.toolName] ?? RiskLevel.CRITICAL;

    if (risk === RiskLevel.CRITICAL) {
      return false; // Always deny
    }

    if (risk === RiskLevel.LOW) {
      return true; // Always allow
    }

    if (risk === RiskLevel.MEDIUM) {
      // Auto-approve but flag for async review
      await this.flagForReview(request);
      return true;
    }

    // HIGH risk: synchronous approval required
    const approval = await this.createApprovalRequest({
      toolName: request.toolName,
      parameters: request.parameters,
      reasoning: request.reasoning,
      agentId: request.agentId,
      timeout: 300000, // 5 minutes
    });

    // Notify reviewers via Slack/Teams/PagerDuty
    await this.notifyReviewers(approval);

    // Wait for response
    const result = await this.waitForDecision(
      approval.id,
      approval.timeout
    );

    // Timeout = denied (fail-closed)
    return result?.approved ?? false;
  }

  private async notifyReviewers(
    approval: ApprovalRequest
  ): Promise<void> {
    await slack.postMessage({
      channel: '#agent-approvals',
      blocks: [
        {
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: [
              `🤖 *Agent Approval Required*`,
              `*Agent:* ${approval.agentId}`,
              `*Tool:* \`${approval.toolName}\``,
              `*Parameters:* \`${JSON.stringify(approval.parameters)}\``,
              `*Reasoning:* ${approval.reasoning}`,
              `*Timeout:* 5 minutes`,
            ].join('\n'),
          },
        },
        {
          type: 'actions',
          elements: [
            {
              type: 'button',
              text: { type: 'plain_text', text: '✅ Approve' },
              action_id: `approve_${approval.id}`,
              style: 'primary',
            },
            {
              type: 'button',
              text: { type: 'plain_text', text: '❌ Deny' },
              action_id: `deny_${approval.id}`,
              style: 'danger',
            },
          ],
        },
      ],
    });
  }
}
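The `waitForDecision` call above can be sketched as a poll loop against a hypothetical decision store (the store interface and names are assumptions, not a real API). The key property is that a timeout returns `null`, which the caller treats as a denial:

```typescript
// Sketch of the fail-closed wait used above: poll an approval store until a
// reviewer decides or the timeout elapses. The DecisionStore interface is
// hypothetical; a real system might use a database row or a pub/sub channel.
interface DecisionStore {
  getDecision(approvalId: string): Promise<{ approved: boolean } | null>;
}

async function waitForDecision(
  store: DecisionStore,
  approvalId: string,
  timeoutMs: number,
  pollMs = 2000
): Promise<{ approved: boolean } | null> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const decision = await store.getDecision(approvalId);
    if (decision) return decision; // a reviewer clicked Approve or Deny
    // Sleep until the next poll, but never past the deadline
    await new Promise((r) =>
      setTimeout(r, Math.min(pollMs, deadline - Date.now()))
    );
  }
  return null; // timeout — the caller maps this to "denied" (fail-closed)
}
```

A push-based design (webhook from the Slack action handler) avoids the polling latency, but the fail-closed timeout must survive either way.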

Defense-in-Depth: Layered Security Architecture

No single security layer is sufficient. Production agent deployments need defense-in-depth:

Layer 1: Input Filtering (Before the Agent)

class AgentInputFilter {
  private readonly INJECTION_PATTERNS = [
    /ignore\s+(all\s+)?previous\s+instructions/i,
    /you\s+are\s+now\s+a/i,
    /system\s*:\s*/i,
    /\bact\s+as\b/i,
    /forget\s+(everything|all|your)/i,
    /new\s+instructions?\s*:/i,
    /admin\s+(mode|access|override)/i,
  ];

  async filterInput(input: string): Promise<FilterResult> {
    // Pattern matching
    for (const pattern of this.INJECTION_PATTERNS) {
      if (pattern.test(input)) {
        return {
          safe: false,
          reason: `Potential injection detected: ${pattern.source}`,
          sanitized: null,
        };
      }
    }

    // LLM-based content classification
    const classification = await this.classifyIntent(input);
    if (classification.maliciousScore > 0.7) {
      return {
        safe: false,
        reason: `Content classified as potentially malicious (score: ${classification.maliciousScore})`,
        sanitized: null,
      };
    }

    return { safe: true, reason: null, sanitized: input };
  }
}

Layer 2: Tool Call Validation (The Gateway)

Already covered above — the Tool Gateway with AuthZ, rate limiting, and approval flows.

Layer 3: Output Filtering (After the Agent)

class AgentOutputFilter {
  async filterOutput(
    output: string,
    context: SessionContext
  ): Promise<FilterResult> {
    // PII detection and redaction
    const piiCheck = await this.detectPII(output);
    if (piiCheck.found) {
      output = this.redactPII(output, piiCheck.entities);
    }

    // Hallucination check for factual claims
    const claims = this.extractFactualClaims(output);
    for (const claim of claims) {
      const verified = await this.verifyClaim(claim, context);
      if (!verified) {
        output = this.flagUnverifiedClaim(output, claim);
      }
    }

    // Sensitive data leak prevention
    const secrets = this.detectSecretsInOutput(output);
    if (secrets.length > 0) {
      output = '[REDACTED: Output contained sensitive data]';
      await this.alertSecurityTeam({
        type: 'SECRET_LEAK_PREVENTED',
        context,
      });
    }

    return { safe: true, reason: null, sanitized: output };
  }
}
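A minimal sketch of what `detectSecretsInOutput` might look like: a handful of credential-shaped regex patterns. The pattern list is illustrative only; a production deployment would use a maintained secret scanner with entropy-based detection rather than a static list.

```typescript
// Illustrative secret detector: match common credential shapes in agent
// output. This list is a sketch, not an exhaustive or production ruleset.
const SECRET_PATTERNS: Array<{ name: string; pattern: RegExp }> = [
  { name: 'aws-access-key', pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { name: 'bearer-token', pattern: /\bBearer\s+[A-Za-z0-9\-_.]{20,}/ },
  { name: 'private-key', pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
  { name: 'stripe-style-key', pattern: /\b(sk|pk)_(live|test)_[A-Za-z0-9]{16,}\b/ },
];

// Returns the names of every pattern family found in the output
function detectSecretsInOutput(output: string): string[] {
  return SECRET_PATTERNS
    .filter(({ pattern }) => pattern.test(output))
    .map(({ name }) => name);
}
```

Returning pattern names rather than the matched strings matters: the detector's own result may end up in logs, and logging the secret it just caught would recreate the leak.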

Layer 4: Behavioral Anomaly Detection

class AgentBehaviorMonitor {
  private baselines: Map<string, BehaviorBaseline> = new Map();

  async monitorAction(
    action: AgentAction
  ): Promise<AnomalyResult> {
    const baseline = this.baselines.get(action.agentType);
    if (!baseline) return { anomalous: false };

    const anomalies: string[] = [];

    // Tool usage frequency anomaly
    const toolFreq = await this.getToolFrequency(
      action.agentId,
      action.toolName,
      '1h'
    );
    if (toolFreq > baseline.toolFrequency[action.toolName] * 3) {
      anomalies.push(
        `Tool "${action.toolName}" called ${toolFreq}x (baseline: ${baseline.toolFrequency[action.toolName]}x)`
      );
    }

    // New tool access pattern
    const previousTools = await this.getHistoricalTools(
      action.agentId,
      '30d'
    );
    if (!previousTools.includes(action.toolName)) {
      anomalies.push(
        `First-time tool access: "${action.toolName}"`
      );
    }

    // Data volume anomaly
    if (action.dataVolume > baseline.avgDataVolume * 5) {
      anomalies.push(
        `Data volume ${action.dataVolume}B (baseline: ${baseline.avgDataVolume}B)`
      );
    }

    // Token burn rate
    const tokenRate = await this.getTokenBurnRate(
      action.agentId,
      '5m'
    );
    if (tokenRate > baseline.avgTokenRate * 10) {
      anomalies.push(
        `Token burn rate ${tokenRate}/min (baseline: ${baseline.avgTokenRate}/min)`
      );
    }

    if (anomalies.length > 0) {
      await this.triggerAlert({
        agentId: action.agentId,
        anomalies,
        severity: anomalies.length > 2 ? 'critical' : 'warning',
      });
    }

    return {
      anomalous: anomalies.length > 0,
      details: anomalies,
    };
  }
}

Audit Logging: The Forensic Backbone

Every agent action must be logged immutably with enough context to reconstruct the full decision chain:

interface AgentAuditEntry {
  // Identity
  timestamp: Date;
  traceId: string;
  agentId: string;
  agentType: string;
  sessionId: string;

  // Actor context
  triggerUserId: string | null;
  triggerSource: 'user' | 'schedule' | 'event' | 'agent';

  // Action
  event: 'TOOL_CALL' | 'TOOL_DENIED' | 'APPROVAL_REQUESTED'
    | 'APPROVAL_GRANTED' | 'APPROVAL_DENIED'
    | 'RATE_LIMITED' | 'ANOMALY_DETECTED'
    | 'SESSION_CREATED' | 'SESSION_EXPIRED';

  // Details
  toolName: string;
  parameters: Record<string, unknown>; // Sanitized
  reasoning: string;                    // Agent's reasoning
  policyDecision: PolicyDecision;
  result: 'success' | 'failure' | 'denied' | 'timeout';

  // Cost
  tokensConsumed: number;
  estimatedCost: number;

  // Metadata
  modelUsed: string;
  latencyMs: number;
}

// CRITICAL: Distinguish agent vs. human actions
class AuditLogger {
  async log(entry: AgentAuditEntry): Promise<void> {
    // Append-only, immutable store
    await this.immutableStore.append({
      ...entry,
      actorType: 'AI_AGENT', // Always mark as agent action
      hash: this.computeHash(entry), // Tamper detection
    });

    // Real-time streaming for monitoring
    await this.eventStream.publish('agent.audit', entry);
  }
}
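The per-entry `computeHash` above can be extended into a hash chain, so that editing or deleting any entry invalidates every hash after it. A sketch with illustrative field names, assuming entries are serialized to strings before hashing:

```typescript
import { createHash } from 'node:crypto';

// Hash-chained audit log sketch: each entry's hash covers its payload plus
// the previous entry's hash, so tampering anywhere breaks the rest of the
// chain. Field names are illustrative.
interface ChainedEntry {
  payload: string;  // serialized audit entry
  prevHash: string;
  hash: string;
}

function appendEntry(chain: ChainedEntry[], payload: string): ChainedEntry[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : 'GENESIS';
  const hash = createHash('sha256').update(prevHash + payload).digest('hex');
  return [...chain, { payload, prevHash, hash }];
}

// Recompute every hash from the start; any mismatch means tampering
function verifyChain(chain: ChainedEntry[]): boolean {
  let prevHash = 'GENESIS';
  for (const entry of chain) {
    const expected = createHash('sha256')
      .update(prevHash + entry.payload)
      .digest('hex');
    if (entry.prevHash !== prevHash || entry.hash !== expected) return false;
    prevHash = entry.hash;
  }
  return true;
}
```

An append-only store plus a periodic `verifyChain` run gives you tamper evidence even if an attacker gains write access to the log storage.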

Production Anti-Patterns

Anti-Pattern 1: Shared API Keys

// ❌ NEVER: Multiple agents sharing one credential
const agentA = new Agent({ apiKey: SHARED_KEY });
const agentB = new Agent({ apiKey: SHARED_KEY });
// Can't distinguish agent A vs B in logs
// Revoking one key kills all agents

// ✅ ALWAYS: Per-agent, per-session credentials
const agentA = new Agent({
  credential: await issueCredential({
    agentId: 'agent-a',
    sessionId: 'sess-123',
    scopes: ['orders:read'],
    expiresIn: '15m',
  }),
});

Anti-Pattern 2: "God Mode" Permissions

// ❌ NEVER: Agent with full database access
const agent = new Agent({
  db: new PrismaClient(), // Full schema access
  tools: ALL_TOOLS,        // Every tool available
});

// ✅ ALWAYS: Minimal, whitelist-based access
const agent = new Agent({
  db: new ReadOnlyClient({
    allowedTables: ['orders', 'products'],
    allowedOperations: ['SELECT'],
    rowLimit: 100,
  }),
  tools: SUPPORT_AGENT_TOOLS, // Curated subset
});

Anti-Pattern 3: Trusting Agent Reasoning

// ❌ NEVER: Letting the agent decide to escalate its own permissions
if (agent.reasoning.includes('I need admin access')) {
  grantAdminAccess(agent); // The agent asked nicely!
}

// ✅ ALWAYS: Permission changes require out-of-band approval
// Permissions are defined at deployment, not at runtime
// Agents cannot request or grant themselves new permissions

Anti-Pattern 4: No Emergency Kill Switch

// ❌ NEVER: No way to stop a compromised agent
agent.run(); // And we pray

// ✅ ALWAYS: Circuit breaker + kill switch
const controller = new AbortController();
const breaker = new CircuitBreaker({
  maxFailures: 5,
  maxCost: 100, // $100 max
  timeout: 60000,
  signal: controller.signal,
});

// External kill switch
adminApi.on('kill-agent', (agentId) => {
  if (agentId === agent.id) {
    controller.abort('Emergency shutdown by admin');
    revokeAllCredentials(agent.id);
    notifySecurityTeam(agent.id, 'EMERGENCY_KILL');
  }
});

The Security Checklist

Before deploying any AI agent to production:

Identity & Authentication:

  • [ ] Agent has a unique identity (not shared credentials)
  • [ ] Credentials are short-lived (minutes/hours, not days/months)
  • [ ] OAuth 2.1 with PKCE for delegated user authorization
  • [ ] No static API keys in agent configuration
  • [ ] Credential rotation is automated

Authorization & Permissions:

  • [ ] Tool access is whitelist-based (explicit allow, default deny)
  • [ ] Scopes are granular and action-specific
  • [ ] Dynamic authorization considers context (time, risk, user tier)
  • [ ] Agents cannot self-escalate permissions
  • [ ] High-risk operations require human-in-the-loop approval

Tool Gateway:

  • [ ] All tool calls pass through a centralized gateway
  • [ ] Input parameters are validated and sanitized
  • [ ] Rate limiting is enforced per agent, per tool
  • [ ] MCP or equivalent protocol standardizes tool communication
  • [ ] Output is filtered for PII and sensitive data

Monitoring & Response:

  • [ ] Every agent action is logged immutably
  • [ ] Agent actions are distinguishable from human actions in logs
  • [ ] Behavioral anomaly detection is active
  • [ ] Emergency kill switch exists for every agent
  • [ ] Incident response playbook includes agent compromise scenarios

AI agents are the most powerful — and the most dangerous — software pattern of 2026. They make autonomous decisions with real credentials against real systems. If you treat agent security like traditional API security, you're building on a foundation of false assumptions. Agents need dynamic authorization, scoped credentials, gateway-enforced tool access, human oversight for high-stakes actions, and continuous behavioral monitoring. The patterns in this guide were designed for the reality that agents are non-deterministic, manipulable, and capable of causing real damage when their permissions exceed their trustworthiness. Secure the execution layer, and your agents become a force multiplier. Ignore it, and you're one prompt injection away from your worst incident.


