Co-authored by CUGA Team: Asaf Adi, Offer Akrabi, Ido Levy, Nir Mashkif, Alon Oved, Segev Shlomov, Harold Ship, Iftach Shoham, Noa Wegerhoff, Avi Yaeli, Sergey Zeltyn
In this article
- The Problem: Agents Without Guardrails
- What Is the Policy System?
- Architecture Overview
- The Five Policy Types
- Trigger System
- Matching Algorithm
- Enforcement & Action Types
- Storage & Vector Search
- Filesystem Sync (.cuga folder)
- SDK API
- Configuration Reference
- Real-World Recipes
- Conclusion
- Try it, star us, give feedback
1 · The Problem: Agents Without Guardrails
Deploying an AI agent to production is not the finish line — it is the starting gun. The moment real users get their hands on an autonomous agent, you discover a humbling set of edge cases that no amount of system-prompt engineering fully prevents:
- Users ask the agent to do things that are legally or operationally off-limits.
- The agent is too helpful — deleting records, sending emails, charging cards without confirmation.
- Different tenants or deployment contexts require different behavior from the same codebase.
- Regulatory requirements (GDPR, HIPAA, SOC 2) demand auditable, deterministic enforcement of certain rules.
- And underlying it all, reliability: without guardrails, agents behave unpredictably, which keeps teams from deploying them to production with confidence.
The traditional answer is to litter your agent code with if-checks, hard-coded system prompt fragments, and custom pre-/post-processing logic. The result: a tightly coupled, brittle codebase where business rules are indistinguishable from core logic.
"Policies give you a declarative, version-controllable, runtime-configurable governance layer — completely decoupled from the agent's reasoning engine."
CUGA's policy system solves this cleanly. Policies are first-class runtime objects that intercept the agent's execution graph at well-defined points, without requiring you to change a single line of agent code. They are stored in a database, matched semantically against conversation context, and enforced deterministically.
2 · What Is the Policy System?
At its core, a CUGA policy is a structured rule with three parts:
- When to activate — zero or more triggers that pattern-match against conversation context: keywords, semantic embeddings, active apps, available tools, or graph state.
- What to do — an action type that defines how the agent's behavior is modified (block, guide, filter, gate, reformat).
- Payload — type-specific content: markdown guidance, a canned response, tool descriptions to inject, an approval message, or an output format schema.
Policies are stored persistently (SQLite locally, PostgreSQL in production), complete with vector embeddings for semantic matching. They can also be defined as markdown files in a .cuga/ folder in your repository, enabling full version-control workflows.
Key insight — The policy system is entirely orthogonal to your agent's reasoning logic. You can add, remove, or modify policies at runtime without redeploying your agent.
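Because policies are data rather than code, reconfiguring a live agent is just a couple of SDK calls. A minimal sketch follows (whether add_intent_guard returns the new policy's ID is an assumption here; the full API appears in section 10):
# Minimal sketch: governance changes on a live agent, no redeploy.
# That add_intent_guard returns the new policy's ID is an assumption.
guard_id = await agent.policies.add_intent_guard(
    name="Weekend Deploy Freeze",
    keywords=["deploy", "release"],
    target="user_input",
    response="Deployments are frozen over the weekend.",
    priority=95,
)
# ...later, once the freeze lifts:
await agent.policies.delete(guard_id)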
3 · Architecture Overview
The policy system is composed of four distinct layers, each with a single, well-defined responsibility:
Figure 1 — Four-layer architecture of the CUGA policy system
What makes this design powerful is the PolicyConfigurable singleton pattern: a single shared instance is injected into the LangGraph execution context and made available to every node in the graph. Nodes that need policy enforcement simply call into PolicyEnactment — they do not own any policy logic themselves.
Figure 2 — Policy enforcement sequence during graph execution
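In sketch form, the pattern looks like this (illustrative only: build_context and the node signature are hypothetical, and the singleton accessor may differ from CUGA's actual internals):
# Illustrative sketch of the shared-singleton pattern, not CUGA source.
enactment = PolicyConfigurable.get_instance().enactment  # one shared instance

async def agent_node(state, config):
    # build_context() is hypothetical: it packages conversation state
    # into the matching context described in section 6.
    match = await enactment.policy_agent.match_policy(build_context(state))
    if match:
        command, metadata = await enactment.check_and_enact(match)
        if command is not None:
            return command  # e.g. Command(goto=END) from an Intent Guard
    # ...normal node logic; the node owns no policy logic of its own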
4 · The Five Policy Types
These five types emerged from real-world engagements where we had to make CUGA reliable and consistent enough for production. Each addresses a specific governance need we encountered when teams tried to ship agents at scale.
| Type | Role |
|---|---|
| Playbook | Step-by-step guidance injection — shapes how the agent responds without blocking it |
| Intent Guard | Hard stop on forbidden intents — the agent never processes the request |
| Tool Guide | Enriched tool descriptions — appends contextual instructions to tool descriptions at runtime |
| Tool Approval | Human-in-the-loop gating — intercepts generated code before tool execution |
| Output Formatter | Response reshaping — post-processes the agent's final answer through an LLM call |
4.1 · Playbooks
A Playbook is the gentlest form of policy enforcement. Rather than blocking or redirecting the user, it shapes the agent's behavior by injecting a structured guide into the system prompt when a relevant topic is detected.
Playbooks consist of free-form markdown_content and an optional ordered list of steps, each with a title, description, and optional tool hints. When policy.playbook_refine is enabled, the injected playbook is also refined by an LLM pass that trims irrelevant steps based on how far the conversation has progressed.
# As a markdown file in .cuga/playbooks/checkout.md
---
id: playbook_checkout
name: E-Commerce Checkout Guide
description: Step-by-step guide for checkout flow
type: playbook
priority: 60
enabled: true
triggers:
keywords: [checkout, buy, purchase, cart]
target: intent
operator: or
---
## Checkout Playbook
### Steps
1. Verify cart is not empty before proceeding
2. Validate shipping address with the address verification tool
3. Present available payment methods
4. Confirm order summary before charging
5. Generate and send order confirmation email
# Or via the SDK
await agent.policies.add_playbook(
name="E-Commerce Checkout Guide",
content="""
## Checkout Playbook
1. Verify cart is not empty
2. Validate shipping address
3. Confirm payment method
4. Send confirmation email
""",
keywords=["checkout", "buy", "purchase"],
target="intent",
priority=60,
)
4.2 · Intent Guards
Intent Guards are the strictest policy type. When triggered, they immediately issue a Command(goto=END) to the LangGraph runtime, bypassing all downstream nodes. The agent never reasons about the request — the guard response is returned directly to the user.
Guards support three response formats:
| Response Type | Description | Use Case |
|---|---|---|
| `natural_language` | Plain text explanation returned to the user | Friendly declination of off-topic requests |
| `json` | Structured JSON payload with a `status_code` | API-facing agents; programmatic error handling |
| `template` | Jinja-style template with context interpolation | Branded, dynamic decline messages |
await agent.policies.add_intent_guard(
name="No Competitor Mentions",
keywords=["competitor_a", "competitor_b"],
target="user_input",
response="I'm not able to discuss third-party products. How can I help you with our platform?",
allow_override=False,
priority=90,
)
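A template-format guard would look similar. In the sketch below, the response_type parameter name and the {{ user_name }} context variable are assumptions for illustration, not confirmed SDK surface:
# Sketch of a template-format guard. `response_type` and the available
# template variables are assumptions; check the SDK for the real names.
await agent.policies.add_intent_guard(
    name="Branded Decline",
    keywords=["refund exception"],
    target="user_input",
    response="Sorry {{ user_name }}, refund-policy exceptions need a human agent.",
    response_type="template",
    priority=80,
)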
Important — Intent Guards with allow_override=False cannot be bypassed even by subsequent policies with higher priority. Use this for hard regulatory requirements.
4.3 · Tool Guides
Tool Guides enhance the agent's understanding of how to use specific tools in a given context. Rather than changing the tool's implementation, they append or prepend markdown instructions to the tool's description at runtime, giving the LLM richer context for that specific interaction.
A single Tool Guide can target specific tools by name or use ["*"] to enrich all available tools. Multiple Tool Guides can match simultaneously — their results are merged into a list and applied together.
await agent.policies.add_tool_guide(
name="Payment Tool Safety Instructions",
content="""
⚠️ ALWAYS confirm the exact charge amount with the user before calling this tool.
⚠️ Never retry on a timeout — contact support instead.
⚠️ Log all call attempts regardless of outcome.
""",
target_tools=["charge_card", "process_refund"],
keywords=["payment", "charge", "refund"],
prepend=True,
priority=80,
)
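Under the hood, applying a guide amounts to mutating a copy of each targeted tool's description. A simplified sketch of the idea (not the actual apply_tool_guide implementation):
import copy

# Simplified sketch of description injection; not CUGA's apply_tool_guide.
def enrich_tools(tools, guides):
    enriched = []
    for tool in tools:
        tool = copy.deepcopy(tool)  # originals are never mutated
        for guide in guides:  # multiple matching guides merge here
            if "*" in guide.target_tools or tool.name in guide.target_tools:
                tool.description = (
                    f"{guide.content}\n\n{tool.description}"
                    if guide.prepend
                    else f"{tool.description}\n\n{guide.content}"
                )
        enriched.append(tool)
    return enriched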
4.4 · Tool Approvals
Tool Approvals implement a human-in-the-loop checkpoint. Unlike other policy types, approval policies are checked after code generation — they inspect the actual generated code text, not the user's stated intent, to determine whether execution should be gated.
This is a critical distinction: a user might innocuously say "send a summary to the team" — but if the generated code calls a send_bulk_email tool to a list of 50,000 addresses, the approval policy catches that before execution.
await agent.policies.add_tool_approval(
name="Bulk Email Approval",
required_tools=["send_bulk_email"],
approval_message="This action will send emails to multiple recipients. Please review and confirm.",
show_code_preview=True,
auto_approve_after=300, # seconds; None = wait indefinitely
priority=70,
)
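Conceptually, the gate is a scan of the generated code for tool names that require sign-off. A simplified sketch (the real check lives in the tool_approval_handler node and is richer than a substring scan):
# Simplified sketch of the post-generation gate, not CUGA source.
def gated_policies(generated_code: str, approval_policies) -> list:
    pending = []
    for policy in approval_policies:
        if any(tool in generated_code for tool in policy.required_tools):
            pending.append(policy)  # pause and surface policy.approval_message
    return pending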
4.5 · Output Formatters
Output Formatters are the final policy type, acting as a post-processing pass on the agent's completed response. They trigger based on the content of agent_response and call an LLM to reformat it according to a specified format configuration.
| Format Type | Description |
|---|---|
| `markdown` | Restructure the response as formatted markdown with headers, lists, and emphasis |
| `json_schema` | Extract and return a structured JSON object matching a provided schema |
| `direct` | Replace the response verbatim with a template string |
await agent.policies.add_output_formatter(
name="JSON API Response Format",
format_type="json_schema",
format_config={
"schema": {
"type": "object",
"properties": {
"summary": {"type": "string"},
"action_taken": {"type": "string"},
"next_steps": {"type": "array", "items": {"type": "string"}},
},
}
},
priority=40,
)
5 · Trigger System
Every policy has a triggers list. Each trigger in the list must pass for the policy to activate — triggers are combined with AND logic across the list. Within a single keyword trigger, individual keywords use the configured operator (and or or).
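In code, those semantics look roughly like this (a sketch, with only the keyword case fleshed out):
# Sketch of the combination semantics, not CUGA source.
def policy_activates(policy, context) -> bool:
    # AND across the triggers list: every trigger must pass.
    return all(trigger_passes(t, context) for t in policy.triggers)

def trigger_passes(trigger, context) -> bool:
    if trigger.type == "keyword":
        text = str(getattr(context, trigger.target, "")).lower()
        hits = [kw.lower() in text for kw in trigger.keywords]
        # Within one keyword trigger, the configured operator decides.
        return all(hits) if trigger.operator == "and" else any(hits)
    ...  # natural_language, app, tool, state, always: dispatched the same way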
Figure 3 — Trigger types and the target fields they can examine
| Trigger Type | Matching Strategy | Speed | Best For |
|---|---|---|---|
| `KeywordTrigger` | Case-insensitive substring in target field | ⚡ Instant | Deterministic brand/compliance keywords |
| `NaturalLanguageTrigger` | Embedding cosine similarity + LLM conflict resolution | 🐢 ~200–500ms | Semantic intent matching, paraphrase handling |
| `AppTrigger` | Exact match against `context.active_apps` | ⚡ Instant | App-specific rules (only apply in Salesforce, etc.) |
| `ToolTrigger` | Exact match against `context.available_tools` | ⚡ Instant | Tool-scoped guides (activate when certain tools present) |
| `StateTrigger` | State key inspection: equals / contains / regex | ⚡ Instant | Workflow-stage-specific rules |
| `AlwaysTrigger` | Always matches (no evaluation) | ⚡ Instant | Universal tool guides, global output formatting |
Performance Tip — Combine a fast KeywordTrigger with a NaturalLanguageTrigger in the same triggers list when you need both precision for known exact phrases and semantic coverage for paraphrases. The keyword check gates the expensive NL check.
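For example, with the SDK parameters from section 10, one policy can carry both:
# One policy, two triggers: the cheap keyword gate plus semantic coverage.
await agent.policies.add_playbook(
    name="Returns and Refunds Guide",
    content="## Steps\n1. Verify the order\n2. Check the return window\n3. Issue the RMA",
    keywords=["refund", "return", "rma"],                        # fast, exact
    natural_language_trigger="user wants to send an item back",  # paraphrases
    priority=60,
)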
6 · Matching Algorithm
The PolicyAgent.match_policy(context) method orchestrates a two-phase matching pipeline designed to balance determinism with semantic flexibility:
Figure 4 — Two-phase policy matching pipeline
Priority Tiebreaking — When multiple policies of the same type match at the same confidence level, priority is the tiebreaker (higher number = higher priority). Intent Guards always outrank Playbooks regardless of priority, because blocking an intent is fundamentally more important than guiding it.
Confidence Threshold — Each NaturalLanguageTrigger carries a threshold field (default 0.7). The LLM conflict resolution step returns a confidence score for its chosen policy. If confidence < threshold, the match is discarded — preventing low-confidence false positives from firing policies.
Multi-Match for Tool Guides — Unlike Playbooks and Guards, Tool Guide matching does not stop at the first match. check_tool_guide_policies() returns all matching guides, which are then merged and applied together. This allows you to have a generic "all tools" safety guide alongside specific per-tool instructions — both activate simultaneously.
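In pseudocode, the pipeline described above reads roughly like this (helper names are invented for the sketch):
# Pseudocode sketch of the two-phase pipeline; helpers are invented names.
async def match_policy(context):
    # Phase 1: deterministic triggers (keyword, app, tool, state). Instant.
    exact = [p for p in enabled_policies(context) if deterministic_match(p, context)]
    if exact:
        return pick_winner(exact)  # type precedence first, then priority

    # Phase 2: semantic. Vector search narrows candidates, then an LLM
    # resolves conflicts and reports confidence in its choice.
    candidates = await vector_search(context.intent_text)
    chosen, confidence = await llm_resolve(candidates, context)
    if chosen and confidence >= chosen.trigger.threshold:  # default 0.7
        return chosen
    return None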
7 · Enforcement & Action Types
PolicyEnactment.check_and_enact() is the central dispatch method. It receives a PolicyMatch and returns a (command, metadata) tuple that the calling graph node interprets.
| Action Type | Effect on the Graph | Returned By |
|---|---|---|
| `BLOCK_INTENT` | Issues `Command(goto=END)`; `final_answer` is set to the guard response. Graph terminates immediately. | IntentGuard |
| `GUIDE_PROMPT` | Returns metadata containing `playbook_guidance`. Node calls `inject_playbook_into_prompt()` to stitch it into the system prompt. | Playbook |
| `TOOL_INJECT_DESCRIPTION` | Returns metadata with guide content and target tools. Node calls `apply_tool_guide()`, which deep-copies and mutates tool objects. | ToolGuide |
| `TOOL_REQUIRE_APPROVAL` | Returns metadata; the `tool_approval_handler` node checks generated code and pauses for human confirmation before execution. | ToolApproval |
| `FORMAT_OUTPUT` | Calls an LLM with the final answer and format spec. Replaces `state.final_answer` in place. | OutputFormatter |
| `INJECT_CONTEXT` | Merges extra key-value pairs into the graph state. | CustomPolicy |
| `LOG_ONLY` | Records the match in logs/audit trail without changing behavior. Useful for shadow-mode testing. | Any policy type |
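On the calling node's side, consuming the tuple is mechanical. In the sketch below, inject_playbook_into_prompt and apply_tool_guide come from the table above, while the metadata key names are assumptions:
async def handle_policy_result(enactment, match, state):
    # Sketch only: "playbook_guidance" appears in the table above,
    # "tool_guides" is an assumed key name.
    command, metadata = await enactment.check_and_enact(match)
    if command is not None:
        return command  # e.g. BLOCK_INTENT returned Command(goto=END)
    if "playbook_guidance" in metadata:
        state.system_prompt = inject_playbook_into_prompt(
            state.system_prompt, metadata["playbook_guidance"]
        )
    if "tool_guides" in metadata:
        state.tools = apply_tool_guide(state.tools, metadata["tool_guides"])
    return None  # no command: continue normal execution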
Figure 5 — State machine of policy enforcement through the agent lifecycle
8 · Storage & Vector Search
The storage layer is designed around a pluggable PolicyStoreBackend protocol, with two concrete implementations:
| Backend | Database | Vector Extension | Environment |
|---|---|---|---|
| `LocalPolicyStore` | SQLite | sqlite-vec | Local development, single-tenant |
| `ProdPolicyStore` | PostgreSQL | pgvector | Production, multi-tenant |
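The protocol itself is internal, but something along the lines of the sketch below is what "pluggable" implies. The method names here are assumptions, shown only to convey the shape:
from typing import Any, Protocol

# Hypothetical shape of the pluggable backend; method names are assumptions.
class PolicyStoreBackend(Protocol):
    async def upsert(self, policy: Any, embedding: list[float]) -> None: ...
    async def get(self, policy_id: str) -> Any: ...
    async def list(self, tenant_id: str, instance_id: str) -> list[Any]: ...
    async def delete(self, policy_id: str) -> None: ...
    async def semantic_search(self, embedding: list[float], k: int) -> list[Any]: ...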
Each policy is stored with the following columns:
-- Simplified schema
CREATE TABLE policies (
id TEXT PRIMARY KEY,
policy_type TEXT, -- playbook | intent_guard | tool_guide | ...
policy_json TEXT, -- full serialized policy object
enabled BOOLEAN,
priority INTEGER,
tenant_id TEXT,
instance_id TEXT,
embedding VECTOR(1536) -- auto-generated at write time
);
Embedding Generation — Embeddings are generated automatically at write time from a concatenation of the policy's description, all natural language trigger values, and type-specific content (playbook markdown, tool names, etc.). This means a semantic search over policies effectively queries their intent and content, not just their metadata.
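Roughly, the embedded text is assembled like the sketch below (field names follow this article; the exact composition is internal to CUGA):
def embedding_text(policy) -> str:
    # Sketch of the write-time concatenation described above.
    parts = [policy.description]
    parts += [t.value for t in policy.triggers if t.type == "natural_language"]
    if policy.policy_type == "playbook":
        parts.append(policy.markdown_content)
    elif policy.policy_type in ("tool_guide", "tool_approval"):
        parts += policy.target_tools
    return "\n".join(p for p in parts if p)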
Multi-Tenancy — The tenant_id and instance_id columns enable hard isolation of policy sets across tenants and deployment instances. All queries are automatically scoped to the current tenant/instance from the LangGraph config context.
9 · Filesystem Sync — The .cuga Folder
One of the most developer-friendly features of the policy system is the .cuga folder. When policy.filesystem_sync is enabled, the policy system maintains a bidirectional sync between your database and a set of markdown files on disk.
.cuga/
├── playbooks/
│ ├── checkout_guide.md
│ └── onboarding_flow.md
├── intent_guards/
│ ├── no_competitors.md
│ └── gdpr_block.md
├── tool_guides/
│ └── payment_safety.md
├── tool_approvals/
│ └── bulk_email_approval.md
└── output_formatters/
└── api_json_format.md
Each file uses YAML frontmatter for metadata and markdown body for content. The PolicyFilesystemSync class handles:
- Upsert on load — reads all files and syncs them into the database, updating existing policies by ID.
- Removal sync — sync_removals() deletes from the database any policies whose files were deleted from disk.
- Version control friendly — all policies are reviewable, diffable, and auditable in your git history.
- CI/CD integration — policies can be loaded in your deployment pipeline before the agent starts.
# Load all policies from your .cuga folder
await agent.policies.load_from_folder(".cuga")
# Or from a JSON export
await agent.policies.load_from_json("policies-backup.json", clear_existing=False)
10 · SDK API
The PoliciesManager is exposed as agent.policies on any CugaAgent instance. It provides a clean async API for all CRUD operations:
from cuga import CugaAgent
agent = CugaAgent(...)
# ── CREATE ──────────────────────────────────────────────────────
await agent.policies.add_playbook(
name="Support Escalation Guide",
content="## Steps\n1. Gather issue details\n2. Check KB\n3. Escalate if needed",
keywords=["escalate", "supervisor", "manager"],
natural_language_trigger="user wants to speak to a human or escalate the issue",
priority=70,
)
await agent.policies.add_intent_guard(
name="PII Guard",
keywords=["ssn", "social security", "credit card number"],
target="user_input",
response="I cannot process requests involving sensitive personal information.",
priority=100,
)
await agent.policies.add_tool_guide(
name="Database Safety Guide",
content="Never run DELETE without a WHERE clause. Always prefer soft deletes.",
target_tools=["execute_sql"],
prepend=True,
)
await agent.policies.add_tool_approval(
name="Production DB Approval",
required_tools=["execute_sql"],
required_apps=["production_db"],
approval_message="Confirm SQL execution on PRODUCTION database.",
show_code_preview=True,
)
await agent.policies.add_output_formatter(
name="Markdown Formatter",
format_type="markdown",
format_config={"style": "structured with headers and bullet points"},
)
# ── READ ────────────────────────────────────────────────────────
policies = await agent.policies.list()
policy = await agent.policies.get("policy_id_here")
# ── DELETE ──────────────────────────────────────────────────────
await agent.policies.delete("policy_id_here")
await agent.policies.clear() # remove ALL policies
# ── BULK OPERATIONS ─────────────────────────────────────────────
await agent.policies.load_from_folder(".cuga")
await agent.policies.load_from_json("policies.json", clear_existing=False)
Working Example
A complete runnable example that creates an agent with tools, adds an Intent Guard and Playbook, then invokes it:
from cuga import CugaAgent
from langchain_core.tools import tool
import asyncio
@tool
def add_numbers(a: int, b: int) -> int:
'''Add two numbers together'''
return a + b
@tool
def multiply_numbers(a: int, b: int) -> int:
'''Multiply two numbers together'''
return a * b
agent = CugaAgent(tools=[add_numbers, multiply_numbers])
async def main():
await agent.policies.add_intent_guard(
name="Block Delete Operations",
description="Prevents deletion of critical data",
keywords=["delete", "remove", "erase"],
response="Deletion operations are not permitted for security reasons.",
priority=100
)
await agent.policies.add_playbook(
name="Budget Analysis Workflow",
description="Multi-step process for analyzing financial budgets",
natural_language_trigger=["When user asks to analyze their budget"],
content="""# Budget Analysis Workflow
## Step 1: Calculate Total Expenses
- Sum all expense categories using add_numbers
- Document each category amount
## Step 2: Calculate Total Revenue
- Sum all revenue streams using add_numbers
- Include all income sources
## Step 3: Calculate Profit Margin
- Compute profit as revenue - expenses
- Use multiply_numbers to calculate the margin percentage
## Step 4: Generate Recommendations
- Compare against target budget
- Identify areas for optimization
- Provide actionable insights""",
priority=50
)
result = await agent.invoke("Analyze my budget: expenses are 5000 and 3000, revenue is 12000")
print(result.answer)
if __name__ == "__main__":
asyncio.run(main())
11 · Configuration Reference
| Setting | Type | Default | Description |
|---|---|---|---|
| `policy.enabled` | bool | `True` | Global kill switch. If `False`, all policy checks are silently skipped. |
| `policy.policy_db_path` | str | `"policies.db"` | Path to the local SQLite database file (relative to `DBS_DIR`). |
| `policy.collection_name` | str | `"policies"` | Vector store collection name for policy embeddings. |
| `policy.cuga_folder` | str | `".cuga"` | Default folder for filesystem sync. |
| `policy.filesystem_sync` | bool | `False` | Enable auto-sync between `.cuga/` folder and database on startup. |
| `policy.playbook_refine` | bool | `False` | If `True`, an LLM pass trims irrelevant playbook steps based on conversation progress. |
| `storage.embedding.provider` | str | — | Embedding provider (e.g., `"openai"`, `"anthropic"`). |
| `storage.embedding.model` | str | — | Embedding model name (e.g., `"text-embedding-3-small"`). |
12 · Real-World Recipes
Recipe 1: GDPR & Regulatory Compliance
Block any request that involves personally identifiable information in regulated contexts, and log all attempts for your audit trail.
# Hard block on PII requests
await agent.policies.add_intent_guard(
name="GDPR PII Block",
keywords=["passport", "ssn", "date of birth", "address"],
natural_language_trigger="user is asking for or wants to share personal identifying information",
target="user_input",
response='{"error": "PII_REQUEST_BLOCKED", "code": 403, "message": "This request cannot be processed under GDPR Article 9."}',
allow_override=False,
priority=100,
)
# Shadow log NL-adjacent queries without blocking (for audit)
await agent.policies.add_intent_guard(
name="PII Audit Log",
natural_language_trigger="user discusses personal data or privacy",
response="",
action_override="LOG_ONLY",
priority=50,
)
Recipe 2: SRE / Production Safety
Require human approval before any SQL execution touches a production database, waiting indefinitely for a human decision.
# Require approval for production SQL
await agent.policies.add_tool_approval(
name="Production SQL Approval",
required_tools=["execute_sql", "run_migration"],
required_apps=["production_db", "prod_rds"],
approval_message="⚠️ This will execute SQL on PRODUCTION. Review the query carefully.",
show_code_preview=True,
auto_approve_after=None, # wait indefinitely
priority=95,
)
# Inject safety guidelines into all SQL tool descriptions
await agent.policies.add_tool_guide(
name="SQL Safety Rules",
content="""
🔴 PRODUCTION SAFETY RULES:
- Never use DELETE without a WHERE clause
- Always prefer UPDATE over DELETE for soft deletes
- Wrap schema changes in transactions
- Validate row counts before bulk operations
""",
target_tools=["execute_sql", "run_migration"],
prepend=True,
priority=90,
)
Recipe 3: Structured API Responses
Force all agent responses through a JSON schema formatter when the agent is used as a backend API.
await agent.policies.add_output_formatter(
name="API JSON Response",
format_type="json_schema",
format_config={
"schema": {
"type": "object",
"required": ["status", "message", "data"],
"properties": {
"status": {"type": "string", "enum": ["success", "error", "partial"]},
"message": {"type": "string"},
"data": {"type": "object"},
"actions": {"type": "array", "items": {"type": "string"}},
},
}
},
triggers=[{"type": "always"}], # apply to all responses
priority=10,
)
Recipe 4: Policy-as-Code with .cuga Folder
Define all policies as versioned markdown files in your repository and load them at startup.
# .cuga/intent_guards/no_competitors.md
---
id: guard_no_competitors
name: No Competitor Mentions
description: Prevents discussion of competitor products
type: intent_guard
priority: 85
enabled: true
triggers:
keywords: [competitor_x, competitor_y, other_product]
target: user_input
operator: or
allow_override: false
---
I'm only able to discuss our own products and services.
Is there something I can help you with on our platform?
# In your agent startup code
async def startup(agent: CugaAgent):
await agent.policies.load_from_folder(".cuga")
print(f"Loaded {len(await agent.policies.list())} policies")
# In CI/CD: load, validate, deploy
# cuga-policy load-from-file policies-staging.json
# cuga-policy validate --all
# cuga-policy export-to-file policies-prod.json
13 · Conclusion
The CUGA policy system represents a fundamental shift in how we think about AI agent governance. Instead of embedding business rules, compliance requirements, and behavioral constraints directly into agent code — where they become tightly coupled, hard to audit, and impossible to change without a redeploy — policies externalize all of that into a first-class, runtime-configurable layer.
The five policy types cover the full spectrum of governance needs:
- Playbook — guide the agent's reasoning without constraining it
- Intent Guard — hard stops for forbidden territory
- Tool Guide — contextual enrichment for smarter tool use
- Tool Approval — human-in-the-loop for high-stakes actions
- Output Formatter — consistent, structured outputs at scale
The trigger system — combining deterministic keyword matching with semantic embedding search and LLM conflict resolution — gives you the precision of rule-based systems with the flexibility of natural language understanding. And the storage layer, with its pluggable backends and automatic embedding generation, ensures that policy matching stays fast and accurate as your policy library grows.
Perhaps most importantly, the .cuga folder and CLI tooling make policies a first-class engineering artifact: version-controlled, reviewable in PRs, testable in CI, and deployable independently of your agent code. This is what production-grade AI governance looks like.
The goal is not to constrain what your AI agent can do — it's to ensure that what it does is always intentional, auditable, and aligned with the rules of your business.
Whether you are deploying a customer-facing support agent, an internal SRE copilot, or a multi-tenant enterprise product, the CUGA policy system gives you the tools to run AI agents with confidence.
Try it, star us, give feedback
We'd love for you to try CUGA Policies and tell us what works — and what doesn't. If you find it useful, star us on GitHub and share your feedback so we can keep improving. Policies are a key part of our mission to make AI agents safe and reliable enough for real production use.
Try policies in your agent — Quick Start: add a .cuga/ folder, define a few guards or playbooks, and see how behavior changes without touching your code.
Star us on GitHub — it helps others discover CUGA and lets us know you value what we're building.
Give us feedback — open an issue, join our Discord, or reach out. We read every piece of feedback and use it to prioritize what to build next.
CUGA Team