Nilofer 🚀

Posted on May 16

Agent Constitution: Policy Enforcement and PII Protection for AI Agents

#agents #security #machinelearning #opensource

AI agents are getting more capable. They can browse the web, call APIs, read and write files, and execute code. That capability is exactly what makes them useful and exactly what makes them dangerous without guardrails.

Most agent safety approaches rely on prompt instructions. Tell the model not to delete files. Tell it not to send requests to untrusted URLs. Tell it not to leak PII. But instructions in a prompt are not enforceable: a sufficiently complex agent workflow, a jailbreak attempt, or just an edge case in reasoning can bypass them silently.

Agent Constitution is a policy enforcement framework for AI agents that enforces behavioral rules at the code level, not the prompt level. You define rules in a YAML constitution file, wrap your agent's tool calls with the enforcer, and get PII detection, audit logging, and a real-time dashboard, all without modifying your agent's core logic.

Features

Policy-Based Enforcement - Define rules using YAML constitution files
AST-Based Expression Evaluation - Safe condition evaluation without code injection risks
PII Detection - Regex and Ollama-powered detection of sensitive information
Audit Logging - JSONL-based audit trail with rotation support
Real-Time Dashboard - FastAPI + WebSocket + React dashboard for monitoring
CLI Interface - Rich command-line interface for management

How It Works

The core concept is a constitution: a YAML file that defines policies and, within each policy, rules. Each rule has a condition written as a plain expression, an action (block or notify), and a severity level. The enforcer evaluates these conditions against every tool call before it executes.

The condition evaluation uses AST-based expression parsing, not eval(), so a condition cannot be used to inject and run arbitrary code. An expression like tool_name in ['rm', 'unlink', 'rmdir'] is parsed into an abstract syntax tree and evaluated safely against the tool call context.
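
To make the idea concrete, here is a minimal sketch of a whitelist-style AST evaluator built on Python's standard ast module. It is not the code in rules/evaluator.py; the names safe_eval and _eval_node are hypothetical, and the sketch only covers names, constants, lists, comparisons, and boolean logic. In particular it rejects every function call, so an expression such as context.get('approved', False) would need extra handling that is not shown here.

```python
import ast
import operator

# Comparison operators this sketch allows; anything else is rejected.
_COMPARE = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Gt: operator.gt,
    ast.Lt: operator.lt,
    ast.GtE: operator.ge,
    ast.LtE: operator.le,
    ast.In: lambda a, b: a in b,
    ast.NotIn: lambda a, b: a not in b,
}


def safe_eval(expression: str, context: dict):
    """Evaluate a restricted expression against a context dict (hypothetical helper)."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body, context)


def _eval_node(node, context):
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        # Names resolve only through the supplied context, never builtins.
        if node.id not in context:
            raise NameError(f"unknown name: {node.id}")
        return context[node.id]
    if isinstance(node, (ast.List, ast.Tuple)):
        return [_eval_node(elt, context) for elt in node.elts]
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not _eval_node(node.operand, context)
    if isinstance(node, ast.BoolOp):
        # The generator keeps short-circuit behaviour for and/or.
        values = (_eval_node(v, context) for v in node.values)
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare):
        left = _eval_node(node.left, context)
        for op, comparator in zip(node.ops, node.comparators):
            if type(op) not in _COMPARE:
                raise ValueError(f"operator not allowed: {type(op).__name__}")
            right = _eval_node(comparator, context)
            if not _COMPARE[type(op)](left, right):
                return False
            left = right
        return True
    # Calls, attribute access, subscripts, lambdas and so on are refused.
    raise ValueError(f"node not allowed: {type(node).__name__}")
```

With this sketch, safe_eval("tool_name in ['rm', 'unlink', 'rmdir']", {"tool_name": "rm"}) evaluates to True, while an expression that tries to call __import__ raises ValueError instead of executing. The design choice is to allow node types explicitly rather than to block known-bad ones, so anything unanticipated fails closed.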

PII detection runs as a separate layer. It can use regex patterns for common formats like email addresses, phone numbers, and SSNs, or it can use Ollama with a local model for more nuanced detection. When PII is detected in a tool's output, it can be blocked or redacted before it reaches the agent.
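
The regex half of this layer can be sketched with the standard library alone. The patterns below are deliberately simple illustrations, not the ones shipped in rules/pii_detector.py, and the names PII_PATTERNS, find_pii, and redact_pii are hypothetical. The Ollama-backed path is not shown.

```python
import re

# Illustrative patterns only; real-world PII formats vary far more than this.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_pii(text):
    """Return (pattern_name, matched_text) pairs in order of appearance."""
    hits = []
    for name, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), name, match.group()))
    return [(name, matched) for _, name, matched in sorted(hits)]


def redact_pii(text, placeholder="[REDACTED]"):
    """Replace every match of every pattern with the placeholder."""
    for pattern in PII_PATTERNS.values():
        text = pattern.sub(placeholder, text)
    return text
```

Regex matching is fast and predictable but misses anything that does not fit a fixed format, which is presumably why the framework also offers model-based detection for more nuanced cases.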

Every enforcement decision, allowed or blocked, is written to a JSONL audit log with a timestamp, tool name, action taken, and the specific rule that triggered. The real-time dashboard reads from this audit log via WebSocket and shows violations, enforcement statistics, and the full constitution in one view.
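
JSONL simply means one JSON object per line, which makes appending cheap and reading streamable. A minimal sketch of that pattern follows; the function names and the exact field names are assumptions for illustration and may differ from what audit.py writes.

```python
import json
from datetime import datetime, timezone


def append_audit_entry(log_path, tool_name, action, allowed, rule_name=None):
    """Append one enforcement decision as a single JSON line (hypothetical helper)."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool_name": tool_name,
        "action": action,
        "allowed": allowed,
        "rule_name": rule_name,  # None when no rule triggered
    }
    # Append mode keeps earlier entries intact; each entry ends with a newline.
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


def read_audit_entries(log_path):
    """Read the log back, skipping blank lines."""
    with open(log_path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Because every line is a complete record, a partially written or rotated file can still be parsed line by line, which suits a dashboard that tails the log.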

Installation

# Clone the repository
git clone https://github.com/dakshjain-1616/Agent-Constitution.git
cd Agent-Constitution

# Install dependencies
pip install -e .

Quick Start

1. Create a Constitution
Start with a sample constitution to see the format, or create an empty one:

# Create a sample constitution
agent-constitution init --sample -o my_constitution.yaml

# Or create an empty one
agent-constitution init -o my_constitution.yaml

2. Validate Your Constitution
Before using it, validate that the YAML is well-formed and the expressions are safe:

agent-constitution validate my_constitution.yaml

3. Test Policy Enforcement
Check whether a specific tool call would be allowed or blocked before running it:

agent-constitution check rm --arg path=/tmp/test --constitution my_constitution.yaml

4. Start the Dashboard

agent-constitution dashboard --constitution my_constitution.yaml

Then open http://localhost:8000 in your browser.

Usage

Using the @enforce Decorator

The simplest integration is wrapping tool functions with the @enforce decorator. The enforcer checks the function against the constitution before it executes. If a rule blocks the call, a PolicyViolationError is raised before the function body runs.

# Assumes PolicyViolationError is exported at the package top level
from agent_constitution import Constitution, Enforcer, PolicyViolationError

# Load constitution
constitution = Constitution.from_yaml("my_constitution.yaml")
enforcer = Enforcer(constitution=constitution)

@enforcer.enforce
def delete_file(path: str):
    """Delete a file."""
    import os
    os.remove(path)

# This will be blocked if rm/delete operations are restricted
try:
    delete_file("/tmp/test.txt")
except PolicyViolationError as e:
    print(f"Blocked: {e}")
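
For readers curious how a decorator can stop a call before the function body runs, here is a stripped-down sketch of the pattern. It is not the library's Enforcer: make_enforcer and PolicyViolation are hypothetical stand-ins, the "policy" is just a set of blocked function names, and how the real enforcer maps a function like delete_file to a tool name such as rm is not stated in this post.

```python
import functools
import inspect


class PolicyViolation(Exception):
    """Stand-in for the library's PolicyViolationError."""


def make_enforcer(blocked_tools):
    """Build a decorator that refuses calls to any function named in blocked_tools."""

    def enforce(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind positional and keyword arguments to parameter names so a
            # richer rule engine could inspect them as a tool-call context.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            if func.__name__ in blocked_tools:
                # Raise before calling func, so its body never executes.
                raise PolicyViolation(
                    f"{func.__name__} blocked with args {dict(bound.arguments)}"
                )
            return func(*args, **kwargs)

        return wrapper

    return enforce
```

Decorating a function with enforce = make_enforcer({"delete_file"}) leaves other functions untouched while any call to delete_file raises PolicyViolation. The key property is that the check lives in the wrapper, outside anything the model can influence through its prompt.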

Manual Policy Checking

For cases where you need to check a tool call without decorating a function, for example when the tool call is constructed dynamically, the enforcer exposes a check method directly:

from agent_constitution import Constitution, Enforcer

constitution = Constitution.from_yaml("my_constitution.yaml")
enforcer = Enforcer(constitution=constitution)

# Check a tool call
result = enforcer.check(
    tool_name="curl",
    tool_args={"url": "https://example.com"},
    extra_context={"approved": False}
)

if result.blocked:
    print(f"Blocked by rule: {result.violations[0].rule_name}")
else:
    print("Allowed")

PII Detection

The PII detector can be used standalone, either to detect PII in any text or to redact it before it leaves the agent:

from agent_constitution.rules.pii_detector import PIIDetector

detector = PIIDetector()

# Detect PII in text
text = "Contact me at john@example.com or call 555-123-4567"
matches = detector.detect(text)

for match in matches:
    print(f"Found {match.pattern_name}: {match.matched_text}")

# Redact PII
redacted = detector.redact(text)
print(redacted)  # "Contact me at [REDACTED] or call [REDACTED]"

Audit Logging

The audit logger writes every enforcement decision to a JSONL file and supports log rotation. Logs can be read back programmatically:

from agent_constitution.audit import AuditLogger

logger = AuditLogger(log_path="./audit.jsonl")

# Log an event
logger.log(
    event_type="tool_call",
    tool_name="rm",
    action="block",
    allowed=False,
    violations=[{"rule_name": "block_file_deletion"}]
)

# Read logs
for entry in logger.read_logs(limit=10):
    print(f"{entry.timestamp}: {entry.event_type} - {entry.tool_name}")
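
Because the log is plain JSONL, it can also be summarized without the library at all. The sketch below counts blocked calls per rule; it assumes the on-disk records use the same keys as the logger.log keyword arguments above (allowed, violations, rule_name), which this post does not confirm, and blocked_counts_by_rule is a hypothetical helper name.

```python
import json
from collections import Counter


def blocked_counts_by_rule(lines):
    """Count blocked decisions per rule from an iterable of JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        entry = json.loads(line)
        if entry.get("allowed") is False:
            for violation in entry.get("violations", []):
                counts[violation.get("rule_name", "unknown")] += 1
    return counts
```

Passing an open file handle works directly, for example blocked_counts_by_rule(open("audit.jsonl", encoding="utf-8")), since a text file iterates line by line.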

Constitution Format

The constitution is a YAML file with versioning, named policies, and rules within each policy. Each rule has a name, a condition expression, an action, and a severity level:

version: "1.0"
name: "My Agent Constitution"
description: "Security policies for my AI agent"

policies:
  - name: tool_restrictions
    description: "Restrict access to dangerous tools"
    priority: 10
    rules:
      - name: block_file_deletion
        description: "Prevent file system deletion operations"
        condition: "tool_name in ['rm', 'unlink', 'rmdir']"
        action: block
        severity: critical

      - name: restrict_network_access
        description: "Limit unrestricted network access"
        condition: "tool_name == 'curl' and not context.get('approved', False)"
        action: notify
        severity: high

  - name: data_protection
    description: "Protect sensitive data"
    priority: 5
    rules:
      - name: pii_detection
        description: "Detect and protect PII in outputs"
        condition: "pii_detected == True"
        action: block
        severity: high

pii_config:
  enabled: true
  patterns: ["email", "ssn", "phone"]
  use_ollama: true
  ollama_model: "gemma3:4b"
  ollama_url: "http://localhost:11434"

audit_config:
  enabled: true
  log_path: "./audit_logs.jsonl"
  max_file_size_mb: 100
  retention_days: 30

The priority field controls which policies are evaluated first: higher priority runs first. The action field is either block, which raises a PolicyViolationError, or notify, which logs the event but allows the call through.
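
The ordering and the block versus notify distinction can be sketched in a few lines. This is an illustration under assumptions, not the enforcer's actual logic: the Rule, Policy, and Decision classes are hypothetical, conditions are plain Python callables standing in for parsed expressions, and the sketch keeps evaluating after a block so that every violation is recorded, which the real enforcer may or may not do.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # stand-in for a parsed expression
    action: str  # "block" or "notify"


@dataclass
class Policy:
    name: str
    priority: int
    rules: list


@dataclass
class Decision:
    blocked: bool = False
    violations: list = field(default_factory=list)


def evaluate(policies, context):
    """Evaluate policies from highest to lowest priority and collect violations."""
    decision = Decision()
    for policy in sorted(policies, key=lambda p: p.priority, reverse=True):
        for rule in policy.rules:
            if rule.condition(context):
                decision.violations.append((rule.name, rule.action))
                # Only "block" stops the call; "notify" is recorded and allowed.
                if rule.action == "block":
                    decision.blocked = True
    return decision
```

With the sample constitution's rules expressed this way, a curl call without approval yields one notify violation and blocked stays False, while an rm call sets blocked to True.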

CLI Commands

The CLI covers the full lifecycle from creating and validating a constitution to inspecting audit logs and testing expressions:

# Initialize a constitution
agent-constitution init --sample

# Validate a constitution
agent-constitution validate my_constitution.yaml

# Display constitution contents
agent-constitution show my_constitution.yaml

# Check if a tool call would be allowed
agent-constitution check rm --arg path=/tmp/test --constitution my_constitution.yaml

# Start the dashboard
agent-constitution dashboard --constitution my_constitution.yaml

# View audit logs
agent-constitution audit --log-path ./audit.jsonl

# Show statistics
agent-constitution stats --constitution my_constitution.yaml

# Test expression evaluation
agent-constitution eval-expr "x > 5" --context x=10

Dashboard

The dashboard provides real-time monitoring via FastAPI, WebSocket, and a React frontend. It shows:

Policy violations
Audit logs
Constitution rules and policies
Enforcement statistics

Open http://localhost:8000 after starting with agent-constitution dashboard.

Architecture

agent_constitution/
├── constitution.py      # Pydantic models and YAML handling
├── enforcer.py          # Policy enforcement and @enforce decorator
├── audit.py            # JSONL audit logging
├── cli.py              # Click CLI interface
├── rules/
│   ├── evaluator.py    # AST-based expression evaluation
│   └── pii_detector.py # PII detection with regex/Ollama
└── dashboard/
    ├── server.py       # FastAPI + WebSocket server
    └── frontend/       # React + Tailwind dashboard

Each module has a single responsibility - constitution.py handles Pydantic models and YAML parsing, enforcer.py owns the @enforce decorator and manual check logic, audit.py handles JSONL writing and rotation, and the rules/ directory separates expression evaluation from PII detection.

Testing

The project has comprehensive test coverage with 84 unit tests, all passing:

# Run all tests
pytest tests/ -v

# All tests passing: 84/84 ✓

Test coverage includes constitution loading and YAML parsing, policy enforcement with the @enforce decorator, manual policy checking, regex-based PII detection, audit logging with rotation, expression evaluation and security validation, and rule violation tracking and statistics.

Development

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run specific test file
pytest tests/test_evaluator.py -v

# Run linting
flake8 agent_constitution

# Run type checking
mypy agent_constitution

How I Built This Using NEO

This project was built using NEO. NEO is a fully autonomous AI engineering agent that can write code and build solutions for AI/ML tasks, including AI model evals, prompt optimization, and end-to-end AI pipeline development.

The requirement was a policy enforcement framework for AI agents, one that enforces behavioral rules at the code level rather than relying on prompt instructions, with PII detection, a JSONL audit trail, and a real-time monitoring dashboard. NEO implemented the full system across 10 implementation steps, resulting in a production-ready framework with 84 tests passing.

NEO built the Pydantic constitution models and YAML handling in constitution.py, the policy enforcer with the @enforce decorator in enforcer.py, the AST-based expression evaluator in rules/evaluator.py, the regex and Ollama-powered PII detector in rules/pii_detector.py, the JSONL audit logger with rotation in audit.py, the Click CLI with all eight commands in cli.py, and the FastAPI and WebSocket dashboard server with the React and Tailwind frontend.

How You Can Use and Extend This With NEO

Use it to enforce safety rules across any agent's tool calls.
Wrap any tool function with @enforcer.enforce and define the rules in a YAML constitution. The enforcement happens at the code level, not in the prompt, so it cannot be bypassed by the agent's reasoning or by jailbreak attempts.

Use the audit log to build an observability layer for your agents.
Every enforcement decision lands in a JSONL file with a timestamp, tool name, action, and triggering rule. This gives you a structured, queryable record of everything your agent tried to do, allowed or blocked, which is useful for debugging unexpected agent behaviour and for compliance requirements.

Use PII detection as a standalone layer before agent outputs reach users.
The PIIDetector works independently of the enforcer. You can run it on any text (agent responses, tool outputs, retrieved documents) before that text is displayed or stored, and redact sensitive information automatically.

Extend it with custom PII patterns.
The pii_config section of the constitution accepts a patterns list. New regex patterns for domain-specific sensitive data can be added to the constitution file without touching any code.

Extend it with additional rule conditions.
The AST-based evaluator supports arithmetic, comparisons, and context dictionary access. New conditions that reference additional context fields work immediately once those fields are passed as extra_context in the enforcer's check call.

Final Notes

Agent Constitution shifts AI agent safety from instructions to enforcement. Rules defined in a YAML file are evaluated at the code level on every tool call before the tool executes, so the safety layer is not part of the agent's reasoning but a hard boundary around it.

The code is at https://github.com/dakshjain-1616/Agent-Constitution
You can also build with NEO in your IDE using the VS Code extension or Cursor.
You can use NEO MCP with Claude Code: https://heyneo.com/claude-code
