DEV Community

nghiahsgs

I Built an AI Agent That Governs Itself — Separation of Powers for LLMs

Most AI agents today work like interns with root access — powerful but ungoverned. They can call any tool, access any file, and execute any command. We trust them because... well, we hope the LLM "knows better."

I didn't like that. So I built LawClaw — an AI agent governed by a separation of powers framework, inspired by constitutional law.

The Problem

Every AI agent framework I've seen treats governance as an afterthought:

  • "Just add a system prompt telling it not to do bad things"
  • "Use a content filter on the output"
  • "Hope the model is aligned enough"

This is like running a country with no laws, no courts, and no constitution — just a king who promises to be good.

The Architecture: Three Branches of Power

LawClaw implements three distinct governance layers:

1. Constitution (Immutable Rules)

A constitution.md file that cannot be modified by the agent itself. It defines:

  • Fundamental rights of the owner
  • Boundaries of power (what the agent SHALL NOT do)
  • Resource limits (execution timeouts, iteration caps)
  • Transparency requirements (audit trails)

2. Legislative Branch (Laws)

Detailed laws that the owner can create, update, or repeal:

  • Conduct Law: How the agent communicates and behaves
  • Privacy Law: Data handling rules
  • Safety Law: Dangerous operation restrictions

These are more flexible than the constitution — the owner can adjust them as needs change.
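The create, update, and repeal lifecycle can be sketched as a small registry. This is a minimal illustration, not LawClaw's actual code: the `Law` and `LawBook` names, their fields, and the example rule texts are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Law:
    """One owner-defined law. Hypothetical shape, not LawClaw's schema."""
    name: str
    rules: tuple[str, ...]


class LawBook:
    """Holds the laws currently in force. Only owner-facing code should call it."""

    def __init__(self) -> None:
        self._laws: dict[str, Law] = {}

    def enact(self, law: Law) -> None:
        # Create a new law, or replace an existing law with the same name.
        self._laws[law.name] = law

    def repeal(self, name: str) -> None:
        # Remove a law; repealing an unknown name is a no-op.
        self._laws.pop(name, None)

    def in_force(self) -> list[str]:
        # Names of all active laws, sorted for stable output.
        return sorted(self._laws)


book = LawBook()
book.enact(Law("privacy", ("Never include API keys in replies",)))
book.enact(Law("safety", ("Never run recursive deletes",)))
book.repeal("privacy")
```

Making each `Law` frozen means an update is always a whole-law replacement through `enact`, which keeps every change visible as a single event.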

3. Pre-Judicial Branch (Automated Enforcement)

The pre-judicial branch intercepts every tool call before execution and checks it against the constitution and the laws. The check happens before the action runs, not after it. Think of it as an automated traffic camera that can also stop the car:

  • Agent tries to delete critical files? Blocked.
  • Agent tries to expose an API key? Blocked.
  • Agent tries to modify the constitution? Blocked.

The agent doesn't need to be trusted. It needs to be governed.
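A pre-execution check of this kind might look like the sketch below. It is an illustration under assumptions, not LawClaw's implementation: the tool names (`shell`, `write_file`, `send_message`), the regex rules, and the `ToolCall`/`Verdict` types are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    tool: str
    argument: str


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    reason: str


# Hypothetical deny rules: (tool name, pattern on the argument, reason).
DENY_RULES = [
    ("shell", re.compile(r"\brm\s+-[a-zA-Z]*r"), "recursive delete"),
    ("write_file", re.compile(r"constitution\.md"), "constitution is read-only"),
    ("send_message", re.compile(r"sk-[A-Za-z0-9]{20,}"), "possible API key"),
]


def review(call: ToolCall) -> Verdict:
    """Check one tool call against every deny rule before it runs."""
    for tool, pattern, reason in DENY_RULES:
        if call.tool == tool and pattern.search(call.argument):
            return Verdict(False, reason)
    return Verdict(True, "no rule matched")


def execute(call: ToolCall, tools: dict) -> str:
    """Run the tool only if the review allows it; otherwise report the veto."""
    verdict = review(call)
    if not verdict.allowed:
        return f"BLOCKED: {verdict.reason}"
    return tools[call.tool](call.argument)


tools = {"shell": lambda cmd: f"ran {cmd}"}
blocked = execute(ToolCall("shell", "rm -rf /var/data"), tools)
allowed = execute(ToolCall("shell", "ls -la"), tools)
```

The important property is that `execute` is the only path to a tool, so the model's output never reaches the tool without passing `review`. A pattern list like this is only as strong as its rules; a real enforcement layer would likely need more than regex matching.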

Why This Matters

As AI agents get more capable, the question isn't "can we make them smarter?" — it's "can we make them accountable?"

With traditional approaches, you're relying on the LLM's training to avoid dangerous actions. With separation of powers, the guarantee is structural: the judicial branch blocks any tool call that violates the constitution or the laws, regardless of what the LLM "decides."

Key Design Principles

1. The agent cannot modify its own governance
Just like a citizen can't rewrite the constitution, LawClaw cannot edit its own rules. Only the human owner can.
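One way to enforce this at the tool level is a write guard that resolves every target path before comparing it against a protected directory. The sketch below assumes a hypothetical layout where governance files live under `governance/`; the function names are illustrative and not taken from LawClaw.

```python
from pathlib import Path

# Hypothetical layout: all governance files live under this directory.
GOVERNANCE_DIR = Path("governance").resolve()


def is_protected(target: str) -> bool:
    """True if the path, once resolved, is the governance directory or inside it."""
    resolved = Path(target).resolve()
    return resolved == GOVERNANCE_DIR or GOVERNANCE_DIR in resolved.parents


def guarded_write(target: str, text: str) -> str:
    """Write a file for the agent unless the target is a governance file."""
    if is_protected(target):
        return f"BLOCKED: {target} is a governance file"
    Path(target).write_text(text, encoding="utf-8")
    return "ok"
```

Resolving the path first means a detour such as `notes/../governance/constitution.md` is caught as well. Running the agent as an OS user without write permission on those files would be a stronger complement to a check like this.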

2. Laws are transparent and auditable
Every tool call is logged. The owner can review the full audit trail at any time. No black boxes.
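An audit trail like this can be as simple as an append-only JSON Lines file. The following is a minimal sketch with an assumed record shape (`ts`, `tool`, `argument`, `allowed`); LawClaw's real log format may differ.

```python
import json
import tempfile
import time
from pathlib import Path


def log_tool_call(log_path: Path, tool: str, argument: str, allowed: bool) -> dict:
    """Append one tool-call record to a JSON Lines file and return it."""
    entry = {
        "ts": time.time(),  # Unix timestamp of the call
        "tool": tool,
        "argument": argument,
        "allowed": allowed,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def read_audit_trail(log_path: Path) -> list[dict]:
    """Load every record, oldest first, for the owner to review."""
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Usage: log two calls to a temporary file, then read the trail back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "audit.jsonl"
    log_tool_call(path, "shell", "ls -la", allowed=True)
    log_tool_call(path, "shell", "rm -rf /", allowed=False)
    trail = read_audit_trail(path)
```

Logging blocked calls alongside allowed ones matters here: the vetoes are the part of the trail the owner will most want to see.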

3. Enforcement is automated, not advisory
The judicial branch doesn't "suggest" blocking a dangerous action — it vetoes it. The agent then reports what happened transparently.

4. Separation prevents concentration of power
No single layer controls everything. The constitution limits the laws. The laws guide the agent. The judiciary enforces both.
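The idea that the constitution limits the laws can be made concrete with resource limits. In the sketch below, a law may tighten a limit but never loosen it past the constitutional cap. The `Limits` type and the numbers are assumptions for illustration; real caps would come from constitution.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    timeout_seconds: int
    max_iterations: int


# Hypothetical caps; in practice these would be read from constitution.md.
CONSTITUTION = Limits(timeout_seconds=60, max_iterations=20)


def effective_limits(law: Limits, constitution: Limits = CONSTITUTION) -> Limits:
    """Take the stricter value of each limit, so a law can never exceed the cap."""
    return Limits(
        timeout_seconds=min(law.timeout_seconds, constitution.timeout_seconds),
        max_iterations=min(law.max_iterations, constitution.max_iterations),
    )


# A law that asks for a longer timeout but a lower iteration cap.
law = Limits(timeout_seconds=300, max_iterations=10)
limits = effective_limits(law)
```

Here the requested 300-second timeout is clamped to the constitutional 60, while the stricter iteration cap of 10 from the law is kept.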

What LawClaw Can Do (Within Its Laws)

  • Browse the web and research topics
  • Control a Chrome browser (with persistent sessions)
  • Manage cron jobs for scheduled tasks
  • Send and receive emails
  • Execute shell commands (within safety limits)
  • Work with git repos, create PRs, merge code
  • Fetch real-time data (crypto prices, weather, news)
  • Create 3D scenes with Blender
  • Write, edit, and debug code

All of this — governed. Every action checked. Every dangerous pattern blocked.

The Insight

We've spent thousands of years developing governance frameworks for humans. Constitutions, laws, courts, checks and balances — these aren't bugs in human civilization. They're features.

AI agents need the same thing. Not better prompts. Not bigger models. Better governance.

The agent doesn't need to be trusted. It needs to be governed.


LawClaw is a governed AI agent running as a Telegram bot. If you're interested in the separation of powers approach to AI agent safety, I'd love to hear your thoughts in the comments.

Top comments (1)

nghiahsgs

My repo: github.com/nghiahsgs/LawClaw
Please give it a star if you like it. Thank you!