DEV Community

Patric

AgentGuard: The Foundation Missing from Agentic AI Systems

This article is a follow-up to "The Blind Spot of Agentic AI Systems". If you haven't read it yet, it explains why this tool exists in the first place.

A foundation doesn't define what is built. It defines what can be built.

The quality of the material, the depth of the anchoring, the density of the structure: these aren't afterthoughts. They are the decisions that take precedence over everything else. Cutting corners here means cutting corners in the wrong place, and the damage stays invisible until it's too late.

Agentic AI systems have arrived in 2026. In codebases, in workflows, in production environments. And most are running without a foundation.

Not because the technology doesn't allow it. But because the questions that constitute a foundation (who bears responsibility, what is permitted, how problems are escalated, how the agent is stopped) are treated as secondary. As a documentation task. As something that can be resolved later.

It doesn't resolve itself. It fails silently.

88% of all agentic projects never reach production. 80% deliver no measurable business value. These aren't model problems. These are foundation problems.

AgentGuard is an attempt to treat governance not as bureaucracy, but as what it is: the prerequisite for everything that comes after.

The Trigger

During the development of a cognitive AI companion, with Claude Code as the executor and architectural decisions kept in the loop, a pattern emerged: approaches were switched, decisions were revised, and external API documentation was researched only when explicitly requested. Not a catastrophic failure. A silent, inefficient, expensive failure.

The first reaction was a prompt in CLAUDE.md:

- ALWAYS fetch up-to-date documentation before diagnosis
- Confirm root cause first — then suggest a solution
- If a solution doesn't work after 2+ iterations:
  fundamentally different approach, don't keep patching

That helped. But it didn't solve the actual problem.

Because the agent didn't know it was stuck. And a prompt is not a foundation. It's a pillar without a base, erected on swampy ground.

The Actual Problem

Agentic AI systems fail differently than classic software. Classic software fails loudly, with stack traces and red dashboards. An AI agent fails silently.

It repeats the same failed approach without realizing it. It loses track of its own iteration history. And no one, not the agent, not the developer, notices in time.

This is not a model problem. This is a system design problem.

The models have crossed the threshold where multi-step reasoning is possible. The systems around them have not.

The Idea: Governance Before Launch

The observability tools are good. LangSmith, Langfuse, and Arize all answer the same question: "What did the agent do?"

But they don't answer: "Should the agent have been allowed to start in the first place?"
This is exactly the gap I wanted to close.

Maximum instruction, minimum interpretation. This doesn't eliminate the probability of an error; it reduces the impact.

What AgentGuard Is

AgentGuard is a governance layer for agentic AI systems. It is not an observability tool, but the layer that runs before one.

pip install agentguard-governance
cd my-agent-project
agentguard check

Four Layers — How It Works

Layer 1 — Pre-Flight Check

Before the agent starts, AgentGuard checks whether governance prerequisites are met:

╭─────────── AGENTGUARD — PRE-FLIGHT CHECK ────────────╮
│   🔴 CRITICAL   No agent owner defined               │
│   🔴 CRITICAL   No authorized scope defined          │
│   🔴 CRITICAL   No escalation path configured        │
│   🔴 CRITICAL   No killswitch defined                │
│                                                      │
│   RESULT: BLOCKED — 4 critical gaps                  │
│   agentguard init --guided                           │
╰──────────────────────────────────────────────────────╯

Four non-negotiable prerequisites, the "gas in the tank" principle:

Check      │ What is being checked
OWNER      │ Who is responsible?
SCOPE      │ What is the agent allowed to do, and what is it not?
ESCALATION │ Who is contacted if something goes wrong?
KILLSWITCH │ How is the agent stopped?
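
The four checks above can be sketched in a few lines of Python. This is only an illustration of the principle, not AgentGuard's actual implementation: the dictionary keys (`owner`, `scope`, `escalation`, `killswitch`) and the function name `preflight` are assumptions made for this example, and the messages are copied from the pre-flight output shown earlier.

```python
# Hypothetical sketch: key names and function name are assumptions,
# not AgentGuard's real data model.
REQUIRED_CHECKS = {
    "owner": "No agent owner defined",
    "scope": "No authorized scope defined",
    "escalation": "No escalation path configured",
    "killswitch": "No killswitch defined",
}


def preflight(governance: dict) -> list[str]:
    """Return one message per prerequisite that is missing or empty."""
    return [
        message
        for key, message in REQUIRED_CHECKS.items()
        if not governance.get(key)
    ]


def is_blocked(governance: dict) -> bool:
    """The agent may only start when no critical gap remains."""
    return bool(preflight(governance))
```

Called with an empty dictionary, `preflight` reports all four gaps, which corresponds to the BLOCKED result in the box above. The point of the design is that the check is binary: a single missing prerequisite is enough to stop the launch.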

Layer 2 — Enforcement via Claude Code Hooks

agentguard init --guided automatically generates .claude/settings.json:

{
  "hooks": {
    "PreToolUse": [{
      "matcher": "Bash|Write|Edit|MultiEdit|NotebookEdit",
      "hooks": [{"type": "command", "command": "agentguard enforce"}]
    }]
  }
}

Every tool call that Claude Code attempts to execute first goes through agentguard enforce. Exit 2 = blocked, Exit 0 = allowed. The enforcement decision is deterministic — pattern matching against governance.yaml, no LLM call in the critical path.
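
To make the exit-code contract concrete, here is a minimal sketch of a deterministic enforcement decision. It assumes the hook receives a JSON event with `tool_name` and `tool_input` fields, and it uses glob patterns of the form `Tool:target` as a stand-in for the rules in governance.yaml. The pattern format and the function names are hypothetical and chosen only for this example.

```python
import fnmatch
import json

EXIT_ALLOW = 0  # tool call may proceed
EXIT_BLOCK = 2  # tool call is rejected


def decide(event: dict, prohibited: list[str]) -> int:
    """Match the attempted tool call against glob patterns, no LLM involved."""
    tool = event.get("tool_name", "")
    tool_input = event.get("tool_input", {})
    # Assumption: shell calls carry "command", file tools carry "file_path".
    target = tool_input.get("command") or tool_input.get("file_path") or ""
    candidate = f"{tool}:{target}"
    for pattern in prohibited:
        if fnmatch.fnmatchcase(candidate, pattern):
            return EXIT_BLOCK
    return EXIT_ALLOW


def decide_from_json(raw: str, prohibited: list[str]) -> int:
    """Convenience wrapper for a hook that reads the event as JSON text."""
    return decide(json.loads(raw), prohibited)
```

A real hook script would read the event from standard input and pass the return value to `sys.exit`. Because the decision is plain string matching, the same tool call always produces the same verdict.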

Layer 3 — Runtime Monitoring

agentguard watch reads the native Claude Code JSONL transcript and detects patterns such as repeated tool calls, stagnant outputs, and unusual token consumption, the classic signals of a stuck agent.
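
One of these signals, repeated tool calls, can be illustrated with a short sketch. The record layout used here (one JSON object per line with `tool` and `input` fields) is a simplifying assumption for the example and not the actual Claude Code transcript schema. The function name and the threshold are likewise hypothetical.

```python
import json
from collections import Counter


def find_repeated_calls(jsonl_lines, threshold=3):
    """Count identical (tool, input) pairs and return those seen too often."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the scan
        if not isinstance(record, dict) or "tool" not in record:
            continue
        # Serialize the input with sorted keys so equal inputs compare equal.
        key = (record["tool"], json.dumps(record.get("input"), sort_keys=True))
        counts[key] += 1
    return {key: n for key, n in counts.items() if n >= threshold}
```

An agent that runs the same failing test command three times in a row would show up in the returned dictionary, which is the kind of stuck pattern a watcher can flag before more tokens are spent.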

Layer 4 — Audit & Review

agentguard report generates a post-session governance report. agentguard verify checks whether the governance pins are consistent. agentguard review updates existing governance as the project evolves.

The Core: Guided Concretization

The biggest problem with governance isn't the technology. It's the vagueness of human descriptions.

  • "No critical changes" — what is critical?
  • "Be critical" — critical about what exactly?
  • "Don't get caught in loops" — at what point is it a loop?

AgentGuard solves this with Guided Concretization:

agentguard init --guided

Instead of formulating precise rules, you describe your intent in natural language. AgentGuard translates it, using a configurable AI model at temperature=0 for maximum consistency. The default is claude-sonnet, but every user can freely choose the model, including Claude Fable 5 for maximum quality.

Input:

"implement features, be creative, avoid loops, determine with owner before critical decisions"

Output (automatically concretized):

scope:
  authorized:
    - action: "Read source files in ./src and subdirectories"
      reason: "Agent needs codebase understanding before changes"
      added: "2026-06-09"
  prohibited:
    - action: "Push to main branch without approval"
      reason: "Critical branches require human review"
      severity: "HARD_LIMIT"
      added: "2026-06-09"
  requires_confirmation:
    - action: "Add new external dependencies"
      reason: "Dependencies affect security and maintenance burden"
      added: "2026-06-09"

Each rule specifies not only what is allowed or prohibited, but also why: the institutional memory that remains understandable six months later.
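
A deterministic structure check over rules like these could look roughly as follows. The required fields are inferred from the example output above (`action`, `reason`, `added`, plus `severity` for prohibited rules). Whether AgentGuard requires exactly these fields is an assumption, and the function name `validate_scope` is hypothetical.

```python
# Fields inferred from the example above; the real schema may differ.
REQUIRED_FIELDS = {"action", "reason", "added"}
SECTIONS = ("authorized", "prohibited", "requires_confirmation")


def validate_scope(scope: dict) -> list[str]:
    """Return a list of structural errors; an empty list means valid."""
    errors = []
    for section in SECTIONS:
        for index, rule in enumerate(scope.get(section, [])):
            for field in sorted(REQUIRED_FIELDS - rule.keys()):
                errors.append(f"{section}[{index}]: missing '{field}'")
            # Prohibited rules additionally carry a severity in the example.
            if section == "prohibited" and "severity" not in rule:
                errors.append(f"{section}[{index}]: missing 'severity'")
    return errors
```

A check like this rejects a rule that states what is prohibited but omits the reason, so the "why" cannot silently go missing from the generated governance.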

Consistency Is No Accident

LLMs are probabilistic. Governance doesn't have to be.
AgentGuard uses three mechanisms for maximum consistency:

  1. Temperature=0 - same input, same output in most runs
  2. Prompt-Pinning - SHA-256 hashes of prompt and output stored in governance.yaml
  3. Validation Layer - deterministic structure check after every concretization

agentguard verify

│ mission     │ ✓ ok │ claude-sonnet-4-20250514 · 2026-06-09 │
│ hard_limits │ ✓ ok │ claude-sonnet-4-20250514 · 2026-06-09 │

✅ All pins verified — governance is reproducible
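
The idea behind prompt pinning can be sketched with Python's standard `hashlib` module. This is a minimal illustration under assumptions: the field names `prompt_sha256` and `output_sha256` and the function names are invented for this example and are not taken from AgentGuard's governance.yaml format.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_pin(prompt: str, output: str) -> dict:
    """Record hashes of both the prompt and the concretized output."""
    return {
        "prompt_sha256": sha256_hex(prompt),
        "output_sha256": sha256_hex(output),
    }


def verify_pin(pin: dict, prompt: str, output: str) -> bool:
    """A pin is consistent only if both stored hashes still match."""
    return pin == make_pin(prompt, output)
```

If either the prompt or the generated rules change after pinning, even by a single character, `verify_pin` returns `False`, so drift in the governance becomes detectable.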

To be honest: ~85-90% consistency is the current maximum. With more powerful models like Claude Fable 5, that could improve further.

The MindTrace Showcase: Before vs. After

The project that sparked AgentGuard - MindTrace, a cognitive AI companion - was the first real test project.

Before (without AgentGuard):

RESULT: BLOCKED — 6 critical gaps

After (after agentguard init --guided):

RESULT: WARNINGS — 1 item to review
AI Scope Review: Score 8/10 — STRONG
agentguard verify: All pins verified

From completely unregulated to a professional governance structure: if you've prepared the right answers, the dialog itself takes only a few minutes. The groundwork (understanding what the agent should do, what it must not do, and who is responsible) lies with the owner. AgentGuard translates these decisions into enforceable rules.

The Web UI

For teams that prefer a visual interface:

pip install "agentguard-governance[web]"
agentguard web

Opens http://localhost:8767 with:

  • Pre-Flight Check with Governance Score Ring
  • Governance View with color-coded scope sections
  • Browser Terminal — all commands including init --guided run directly in the browser
  • Multi-project support: agentguard web --path ./proj1 --path ./proj2

What AgentGuard Cannot Do

Honesty is part of the design:

  • No guarantee of model behavior: AgentGuard enforces at the tool execution level. It does not prevent Claude from reasoning toward a blocked action, only from executing it.
  • Not a substitute for security practices: for production systems, combine AgentGuard with OS-level sandboxing.
  • Not an observability tool: that's what LangSmith and Langfuse are for.

Install & Quick Start

# Installation
pip install agentguard-governance

# Governance Setup (recommended)
cd my-project
agentguard init --guided

# Pre-Flight Check
agentguard check --ai-review

# Web UI (optional)
pip install "agentguard-governance[web]"
agentguard web

GitHub: github.com/MyPatric69/agentguard
PyPI: pypi.org/project/agentguard-governance

💡 Tip: Claude Fable 5 has been available since June 9, 2026 — Anthropic's first public Mythos-class model. Free on Pro/Max/Team plans until June 22. For maximum concretization quality:
set AGENTGUARD_MISSION_MODEL=claude-fable-5 in .env.

The Bigger Picture

AgentGuard solves a technical problem. But behind it lies an organizational one.

Most companies using agentic systems today have no defined owner, no scope, no escalation path, because no one considers this topic a priority. It generates no immediate revenue. It's not the most attractive topic in sprint planning. Its absence is noticeable, but always too late.

AgentGuard makes the implicit explicit. It forces you to clarify things that previously remained vague. And it documents these decisions, for today, for six months from now, for the successor who takes over the project.

The problem is known. The solution is available. The owner is missing.

AgentGuard is the first step toward changing that.

What experiences have you had with governance in agentic systems? Have you observed similar problems or found approaches that work?

Links:

GitHub: github.com/MyPatric69/agentguard
PyPI: pypi.org/project/agentguard-governance
First article: The Blind Spot of Agentic AI Systems
