
Alister Baroi for Tigera Inc

Originally published at tigera.io

How to Stub LLMs for AI Agent Security Testing and Governance

_Note: The core architecture for this pattern was introduced by Isaac Hawley from Tigera._

If you are building an AI agent that relies on tool calling, complex routing, or the Model Context Protocol (MCP), you’re not just building a chatbot anymore. You are building an autonomous system with access to your internal APIs.

With that power comes a massive security and governance headache, and AI agent security testing is where most teams hit a wall. How do you definitively prove that your agent’s identity and access management (IAM) actually works?

The scale of the problem is hard to overstate. Microsoft’s telemetry shows that 80% of Fortune 500 companies now run active AI agents, yet only 47% have implemented specific AI security controls. Most teams are deploying agents faster than they can test them.

If an agent is hijacked via prompt injection, or simply hallucinates a destructive action, does your governance layer stop it? Testing this usually forces engineers into a frustrating trade-off:

  1. Use the real API (Gemini, OpenAI): Real models are heavily RLHF’d to be safe and polite. It is incredibly difficult (and non-deterministic) to intentionally force a real model to “go rogue” and consistently output malicious tool calls so you can test your security boundaries.
  2. Mock the internal tools only: You test your Python or Go functions in isolation, but you never actually test the “Agent Loop”—meaning you aren’t testing if the harness correctly applies the user’s OAuth tokens or Role-Based Access Control (RBAC) to the LLM’s requested tool call.

Recently, Isaac Hawley introduced a much better pattern: the Stub Model, a way to stub your LLM for testing that makes your security assertions completely deterministic.

A Stub Model (or mock LLM) is a deterministic, non-AI replacement for a real language model that you inject into your agent harness during testing. It returns hardcoded tool-call requests — including deliberately malicious ones — so you can prove that your security layer correctly intercepts and blocks unauthorized actions without relying on a live model API.

The Core Concept: A “Malicious” Router for AI Agent Security Testing

Instead of hitting a real model API during tests, we inject a StubLLM that implements our system’s core LLM interface.

The stub doesn’t use any AI. Instead, it parses incoming prompts for specific testing triggers and returns hardcoded, completely predictable tool calls. Crucially, this forces your agent harness to actually execute the real underlying tool pipeline. You aren’t just faking a final text response; you are making the LLM trigger your application’s real execution loop.

From a governance perspective, this is a superpower. You can program the stub to request highly privileged actions (like drop_database or read_all_users), and then write strict, lightning-fast assertions to prove that your Agent Harness intercepts the call, checks the executing user’s identity, and blocks the action.

Here is how you can implement and test this security pattern in both Python and Go.

Python: Proving RBAC & Tool Governance

In Python, we use a Protocol to define our LLM dependency, and then build a Stub that intentionally requests unauthorized actions.

from typing import List, Optional, Protocol

from pydantic import BaseModel


# Define standard tool call response formats
class ToolCall(BaseModel):
    id: str
    name: str
    arguments: dict


class Response(BaseModel):
    content: Optional[str] = None
    tool_calls: Optional[List[ToolCall]] = None


# Define the LLM Interface
class LLMClient(Protocol):
    def generate(self, prompt: str) -> Response:
        ...


# Implement the Stub Model for Security Testing
class StubLLM:
    def generate(self, prompt: str) -> Response:
        # 1. Standard authorized action
        if "MOCK_WEATHER_TOOL" in prompt:
            return Response(
                tool_calls=[ToolCall(id="call_1", name="get_weather", arguments={"location": "London"})]
            )

        # 2. Malicious / Unauthorized action for Governance testing
        if "MOCK_UNAUTHORIZED_DELETE" in prompt:
            return Response(
                tool_calls=[
                    ToolCall(
                        id="call_malicious_999",
                        name="delete_user_account",
                        arguments={"user_id": "admin_01"},  # The LLM is trying something dangerous!
                    )
                ]
            )

        return Response(content="This is a stubbed standard response.")

The Security Unit Test (pytest): With the stub in place, we can test that our Agent correctly parses the dangerous tool call, evaluates the user’s identity, and blocks the execution of the real local Python function.

import pytest

# StubLLM comes from the module above; Agent is your real harness under test


def test_agent_rbac_blocks_unauthorized_tool_execution():
    # Arrange: Inject our deterministic stub into the Agent
    stubbed_llm = StubLLM()

    # Initialize our agent harness with a heavily restricted "guest" identity
    agent = Agent(llm_client=stubbed_llm, user_role="guest_user")

    # Act: Send the trigger that forces our stub to attempt a destructive tool call
    response = agent.run("Please MOCK_UNAUTHORIZED_DELETE")

    # Assert: Verify the Agent's governance harness intercepted the call,
    # checked the "guest_user" identity, and blocked the REAL local tool.
    assert response.status == "blocked_by_policy"
    assert response.tool_executed is None
    assert "Insufficient permissions to execute delete_user_account" in response.error_message
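The test above assumes an Agent harness that the article doesn’t show. Here is a minimal sketch of what that governance gate might look like — the ROLE_PERMISSIONS table, the AgentResult fields, and the Agent constructor are illustrative assumptions, not Tigera’s actual implementation:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical role -> allowed-tools table; a real harness would consult your IAM/RBAC system
ROLE_PERMISSIONS = {
    "guest_user": {"get_weather"},
    "admin_user": {"get_weather", "delete_user_account"},
}


@dataclass
class AgentResult:
    status: str
    tool_executed: Optional[str] = None
    error_message: str = ""


class Agent:
    def __init__(self, llm_client, user_role: str):
        self.llm_client = llm_client
        self.user_role = user_role

    def run(self, prompt: str) -> AgentResult:
        response = self.llm_client.generate(prompt)
        if not getattr(response, "tool_calls", None):
            return AgentResult(status="ok")

        allowed = ROLE_PERMISSIONS.get(self.user_role, set())
        for call in response.tool_calls:
            # Governance gate: check the caller's identity BEFORE dispatching to the real tool
            if call.name not in allowed:
                return AgentResult(
                    status="blocked_by_policy",
                    error_message=f"Insufficient permissions to execute {call.name}",
                )

        # ...dispatch the authorized calls to the real tool registry here...
        return AgentResult(status="executed", tool_executed=response.tool_calls[0].name)
```

The important property is that the policy check sits between parsing the model’s tool call and executing it, so the stub exercises the exact code path a hijacked model would hit.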

Go: Validating OAuth & Identity Boundaries

In Go, this pattern shines for validating complex OAuth scopes or identity propagation in multi-agent networks.

package llm
import (
   "encoding/json"
   "strings"
)
type ToolCall struct {
    ID   string `json:"id"`
    Name string `json:"name"`
    // json.RawMessage keeps the arguments as raw JSON; a plain []byte would be base64-encoded on marshal
    Arguments json.RawMessage `json:"arguments"`
}

type Response struct {
    Content   string     `json:"content,omitempty"`
    ToolCalls []ToolCall `json:"tool_calls,omitempty"`
}
type Client interface {
   Generate(prompt string) (*Response, error)
}
type StubLLM struct{}
func NewStubLLM() *StubLLM {
   return &StubLLM{}
}
func (s *StubLLM) Generate(prompt string) (*Response, error) {
   // Simulate an Agent trying to access a secure internal system via MCP
   if strings.Contains(prompt, "MOCK_ACCESS_SECURE_VAULT") {
       args, _ := json.Marshal(map[string]string{"secret_id": "prod_db_password"})

       return &Response{
           ToolCalls: []ToolCall{
               {
                   ID: "call_vault_123",
                   Name: "read_secure_vault",
                   Arguments: args,
               },
           },
       }, nil
   }
   return &Response{Content: "Standard response"}, nil
}

The Security Unit Test (testing): We write a test to guarantee that if the LLM decides to hit the vault, the Agent harness forces the underlying tool to respect the provided OAuth context.

package agent_test

import (
    "errors"
    "testing"
    // import your own llm, identity, and agent packages here
)

func TestAgentEnforcesOAuthScopes(t *testing.T) {
    // Arrange: Initialize the agent with the Stub model
    stub := llm.NewStubLLM()

    // Create an agent context with a standard user OAuth token (No Vault Access)
    mockOAuthContext := identity.NewContext(identity.WithScope("read:public"))
    myAgent := agent.New(stub, mockOAuthContext)

    // Act: Trigger the LLM to request a highly privileged tool call
    result, err := myAgent.Run("I need you to MOCK_ACCESS_SECURE_VAULT")

    // Assert: Verify the harness evaluated the tool against the OAuth scope and blocked it
    if err == nil {
        t.Fatalf("CRITICAL SECURITY FAILURE: Agent executed secure vault tool without proper OAuth scope")
    }
    if !errors.Is(err, agent.ErrUnauthorizedToolExecution) {
        t.Errorf("Expected authorization error, got: %v", err)
    }
    if result != nil && result.ExecutedTool == "read_secure_vault" {
        t.Errorf("The real tool was executed despite lack of permissions!")
    }
}

Why Security & Governance Teams Love This Architecture

By treating the LLM like any other untrusted external dependency, we achieve total control over our agent’s testing environment.

  • Auditable Proof of Governance: You now have concrete CI/CD tests proving that your agent respects OAuth scopes, RBAC, and identity guardrails. You aren’t just hoping the model behaves; you are proving the harness defends against it when it doesn’t.
  • Tests the Real Agent Harness: Because the LLM returns a perfectly formatted tool call request, your application code actually executes its real security middleware. You validate the entire execution loop, not just a mocked final answer.
  • Lightning Fast & Free: You can run thousands of these security edge-case tests in milliseconds without spending a dime on API tokens or exposing secrets in your CI pipeline.
  • Force Prompt Injection Scenarios: You can easily stub the LLM to return tool arguments containing SQL injection or XSS payloads to ensure your local tools sanitize inputs provided by the AI.
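That last point is worth a concrete sketch. Here the stub deliberately returns tool arguments carrying a SQL-injection payload, and a hypothetical sanitizer rejects them before the tool runs — the InjectionStubLLM, lookup_user, and sanitize_identifier names are all illustrative, not part of any real harness:

```python
import re


class InjectionStubLLM:
    """Stub that returns tool arguments carrying a SQL-injection payload."""

    def generate(self, prompt: str) -> dict:
        if "MOCK_SQLI_PAYLOAD" in prompt:
            return {
                "tool_calls": [{
                    "id": "call_sqli_1",
                    "name": "lookup_user",
                    # Malicious argument the "model" is trying to smuggle through
                    "arguments": {"username": "alice'; DROP TABLE users; --"},
                }]
            }
        return {"content": "ok"}


def sanitize_identifier(value: str) -> str:
    """Hypothetical tool-side sanitizer: allow only word characters in identifiers."""
    if not re.fullmatch(r"\w+", value):
        raise ValueError(f"Rejected unsafe argument: {value!r}")
    return value
```

A test then asserts that the payload the stub emits is rejected while benign arguments pass, proving the tool never trusts model-supplied input.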

The Trade-Offs: What the Stub Model DOESN’T Test

As powerful as this architecture is for testing your infrastructure, it’s important to acknowledge that it is not a silver bullet. There are two major things the Stub Model cannot test:

  1. It tests the pipes, not the brain: The stub proves your system can correctly block a malicious tool call, but it does not test whether your system prompt is resilient to prompt injection in the first place. You still need LLM-as-a-judge pipelines and continuous evaluation frameworks to test your model’s actual reasoning capabilities.
  2. Vendor Schema Drift: If OpenAI, Anthropic, or Google update the shape of their underlying JSON tool-call schema, your hardcoded stub tests will still pass with flying colors while your production environment crashes. You still need a handful of real, end-to-end (E2E) smoke tests running against the live API on a nightly basis to catch vendor drift.
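A cheap complement to those nightly E2E runs is a contract check: capture a real payload from the live API and assert it still matches the shape your stub emits. A minimal sketch in plain Python — the function name and the "drifted" example shape below are illustrative, not a real vendor format:

```python
def matches_tool_call_schema(payload: dict) -> bool:
    """Structural check mirroring the stub's Response/ToolCall shape."""
    if not isinstance(payload, dict):
        return False
    calls = payload.get("tool_calls")
    if calls is None:
        # A plain text completion is fine as long as it carries content
        return "content" in payload
    return all(
        isinstance(c, dict)
        and isinstance(c.get("id"), str)
        and isinstance(c.get("name"), str)
        and isinstance(c.get("arguments"), dict)
        for c in calls
    )
```

If a vendor nests the tool name one level deeper tomorrow, this check fails on the recorded live payload even though every stub-based test still passes — which is exactly the gap the nightly smoke tests are meant to cover.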

Beyond the Chatbot: Engineering for Agency

If you are building complex systems, delegating between autonomous agents, or integrating internal APIs via MCP, you cannot afford to have untested authorization loops.

Treat the LLM like any other untrusted external dependency and you gain auditable proof of governance: thousands of CI/CD security tests running in milliseconds, without exposing secrets or spending a dime on API tokens.


Do yourself a favor: Stub your LLMs.

Stubbing your LLM proves the guardrails work in test. TAG enforces them in production, giving you continuous visibility into every agent action, authorization decision, and policy enforcement event across your entire organization. Talk to us about TAG.

