AI agents can now run shell commands, modify files, and deploy infrastructure.
But their logs can be edited after the fact.
GuardClaw is an experiment in cryptographic, tamper-evident execution logs for AI agents.
The Problem: AI agents are gaining real execution power
Modern AI assistants are no longer just answering questions.
They can now:
- run shell commands
- read and modify files
- interact with databases
- call APIs
- execute DevOps workflows
Frameworks like LangChain, AutoGen, and MCP make it easy to give AI agents real capabilities.
But this raises a simple question:
If an AI agent does something dangerous, how do we prove what actually happened?
Most systems rely on traditional logs.
The problem is that logs are mutable.
Anyone with access can edit them.
A simple example
Imagine an AI DevOps assistant runs the following command:
shell.exec("rm production.db")
Now someone edits the log file to hide it:
shell.exec("ls")
If you investigate later, the logs look normal.
The destructive command is gone.
There is no way to prove the log was edited.
The idea: tamper-evident execution logs
I built an open-source project called GuardClaw to experiment with a different approach.
Instead of normal logs, every action is written to a cryptographically signed ledger.
Each entry is:
- Canonicalized
- Linked to the previous entry using a SHA-256 hash
- Signed with an Ed25519 signature
- Appended to a JSONL ledger
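Those four steps can be sketched with nothing but the Python standard library. This is an illustrative model, not GuardClaw's actual implementation: real Ed25519 signing needs a third-party library, so HMAC-SHA256 stands in for the signature here, and the field names are hypothetical.

```python
import hashlib
import hmac
import json

# Stand-in for an Ed25519 private key (kept stdlib-only for the sketch).
SIGNING_KEY = b"demo-signing-key"

def canonicalize(body: dict) -> bytes:
    # Deterministic serialization: sorted keys, no insignificant whitespace.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()

def append_entry(ledger: list, action: str) -> dict:
    # Link to the previous entry (or a zero hash for the genesis entry).
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    body = {"seq": len(ledger), "action": action, "prev_hash": prev_hash}
    payload = canonicalize(body)
    entry = dict(body)
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    entry["sig"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    ledger.append(entry)
    return entry

ledger = []
append_entry(ledger, 'tool.search("AI governance")')
append_entry(ledger, 'shell.exec("ls")')
print(ledger[1]["prev_hash"] == ledger[0]["hash"])  # True: entries are chained
```

In a real system each dict would be serialized as one line of the JSONL ledger.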
This means:
- deleting entries breaks the chain
- editing entries breaks the signature
- reordering entries breaks the hash linkage
If the ledger is modified, verification fails.
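Continuing the same stdlib sketch (HMAC again standing in for Ed25519; this mirrors the idea, not GuardClaw's code), verification re-derives every hash and signature and checks each link, so all three failure modes surface:

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-signing-key"  # stand-in for an Ed25519 key

def canonicalize(body):
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()

def append_entry(ledger, action):
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    body = {"seq": len(ledger), "action": action, "prev_hash": prev_hash}
    payload = canonicalize(body)
    ledger.append({**body,
                   "hash": hashlib.sha256(payload).hexdigest(),
                   "sig": hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()})

def verify(ledger):
    prev_hash = "0" * 64
    for seq, entry in enumerate(ledger):
        body = {k: entry[k] for k in ("seq", "action", "prev_hash")}
        payload = canonicalize(body)
        if entry["seq"] != seq:                # deletion or reordering
            return False
        if entry["prev_hash"] != prev_hash:    # broken hash linkage
            return False
        if entry["hash"] != hashlib.sha256(payload).hexdigest():
            return False                       # edited entry
        if not hmac.compare_digest(
                entry["sig"],
                hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()):
            return False                       # forged or missing signature
        prev_hash = entry["hash"]
    return True

ledger = []
for action in ['tool.search("AI governance")',
               'shell.exec("rm production.db")',
               'shell.exec("ls")']:
    append_entry(ledger, action)

print(verify(ledger))  # True
del ledger[1]          # hide the destructive command
print(verify(ledger))  # False: the missing entry breaks the chain
```

One caveat of any pure hash chain: silently truncating entries at the tail is only detectable if the latest head hash is also anchored somewhere external.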
A small demo
Install GuardClaw:
pip install guardclaw
Run a simple agent example:
from guardclaw import init_global_ledger, Ed25519KeyManager
from guardclaw.mcp import GuardClawMCPProxy

# Generate an Ed25519 signing key and start a ledger for this agent
km = Ed25519KeyManager.generate()
init_global_ledger(key_manager=km, agent_id="demo-agent")

# Route tool calls through the proxy so each one is recorded
proxy = GuardClawMCPProxy("demo-agent")

def search(query):
    return f"Results for: {query}"

proxy.register_tool("search", search)
proxy.call("search", query="AI governance")
GuardClaw writes a ledger automatically:
.guardclaw/ledger/ledger.jsonl
Verifying the ledger
Run:
python -m guardclaw verify .guardclaw/ledger/ledger.jsonl
Output:
VALID
ledger integrity confirmed
Now try editing the ledger file.
Delete one entry.
Run verification again.
python -m guardclaw verify .guardclaw/ledger/ledger.jsonl
Output:
INVALID
causal hash mismatch
ledger integrity violated
The modification is immediately detected.
Why this matters for AI agents
AI systems are increasingly executing actions autonomously:
- coding assistants modifying repositories
- DevOps agents deploying infrastructure
- security agents running scans
- trading bots executing transactions
In these environments, it becomes important to answer questions like:
- What did the agent actually do?
- When did it happen?
- Was the log modified afterward?
What an agent execution ledger looks like
Example execution recorded by GuardClaw:
Agent actions:
seq 0 tool.search("AI governance")
seq 1 tool.read_file("config.yaml")
seq 2 shell.exec("rm production.db")
GuardClaw ledger → each entry cryptographically signed and hash-linked
A tamper-evident execution ledger like this makes it possible to verify the agent's sequence of actions cryptographically after the fact.
Integrating with AI systems
GuardClaw already includes adapters for:
- LangChain
- CrewAI
- MCP tool calls
This makes it possible to record:
- tool calls
- agent actions
- execution results
while the system runs normally.
Example: recording GPT tool calls
In a simple demo, an AI assistant calling tools produces a ledger like this:
seq 0 search → INTENT
seq 1 search → RESULT
Every step is signed and chained.
If the ledger is edited later, verification fails.
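The two-phase INTENT/RESULT pattern can be sketched generically: record what the agent is about to do before executing, then record the outcome as a separate entry, so even a crash mid-call leaves evidence of the attempt. The field names here are illustrative, not GuardClaw's schema.

```python
import json

def log_entry(ledger, **fields):
    # Each entry is one JSONL line; a real ledger would also hash and sign it.
    ledger.append(json.dumps(fields, sort_keys=True))

def call_tool(ledger, name, fn, **kwargs):
    seq = len(ledger)
    # Phase 1: record the intent before the tool runs.
    log_entry(ledger, seq=seq, tool=name, phase="INTENT", args=kwargs)
    result = fn(**kwargs)
    # Phase 2: record the outcome as its own chained entry.
    log_entry(ledger, seq=seq + 1, tool=name, phase="RESULT", result=result)
    return result

ledger = []
call_tool(ledger, "search",
          lambda query: f"Results for: {query}",
          query="AI governance")
for line in ledger:
    print(line)
```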
What I'm exploring
GuardClaw is still an early experiment.
I'm interested in exploring ideas like:
- verifiable AI execution logs
- agent accountability systems
- cryptographic audit trails for autonomous agents
If you're building AI agents or automation systems, I’d love to hear how you currently handle logging and auditing.
Project
GitHub:
https://github.com/viruswami5511/guardclaw
PyPI:
https://pypi.org/project/guardclaw/
Feedback welcome
I'm especially interested in feedback from people building:
- AI agents
- automation pipelines
- DevOps tooling
- security infrastructure
What kinds of auditing or accountability tools do you wish existed for AI systems?