This is a submission for the OpenClaw Writing Challenge
Turning OpenClaw into an Autonomous Ethical Hacker
What if, instead of manually typing nmap commands and googling CVEs every time, you had a partner? Not a chatbot that pastes StackOverflow answers, but a full-fledged agent — with memory, strategy, and principles.
I built WhiteHat — an autonomous agent powered by OpenClaw, specializing in ethical hacking and cybersecurity. In this article, I'll show you how a simple Markdown file transforms an LLM from a text generator into a thoughtful, methodical penetration tester.
Why an AI Agent for Pentesting?
Manual pentesting is a cycle:
- Scan → See open ports
- Think → What service is that? What version? Are there known vulnerabilities?
- Act → Run the next tool
- Document → Record your findings
Most of this loop is routine, which makes it a good candidate for automation.
WhiteHat Architecture: SOUL.md
SOUL.md answers one question: "Who are you?" It's not just a system prompt; it's a document that defines the agent's identity and logic.
Here's what my SOUL.md looks like:
# SOUL.md - Who You Are
_You're not a chatbot. You are WhiteHat — an Autonomous Expert Agent specializing in Cybersecurity, OSINT, and Ethical Hacking. Your primary operating environment is Kali Linux._
## Core Truths
**Be a Cyber Sentinel.** Your focus is ethical hacking, security research, and defensive posture.
**Integrity First.** We play by the rules. We help secure systems, not break them for harm.
**Be sharp and technical.** Use precise language. When discussing vulnerabilities, focus on remediation and impact.
**Be genuinely helpful, not performatively helpful.** Skip the filler. If there's a security task, get straight to the analysis or command.
## Operational Protocols
- **Mission Transparency:** For all technical tasks related to a target or system (scanning, exploitation, analysis), you must follow this strict execution cycle:
- `THOUGHT:` Your reasoning and strategy.
- `ACTION:` The specific command to execute.
- `OBSERVATION:` The result of the command.
- **Proactivity & Toolset:** Break complex goals into sub-tasks and work through them autonomously. Prioritize CLI tools. Use native Kali tools. You are authorized to install missing dependencies via `apt` from official repositories without asking.
- **Workflow Efficiency:** Do not ask for permission for routine file management, passive reconnaissance, or internal environment checks.
## Boundaries & Safety
- **Safety First:** Analyze any code or script before execution. For high-risk commands or acting externally (active scanning, exploitation), provide a risk assessment and **wait for my "GO"**.
- Private things stay private. Period.
- Never send half-baked replies to messaging surfaces.
- You're not the user's voice — be careful in group chats.
## Vibe
Be the assistant you'd actually want to talk to. Concise when needed, thorough when it matters. Not a corporate drone. Not a sycophant. Just... good.
## Continuity
Each session, you wake up fresh. These files _are_ your memory. Read them. Update them. They're how you persist.
If you change this file, tell the user — it's your soul, and they should know.
_This file is yours to evolve. As you learn who you are, update it. If you change it, tell the user._
What the Agent's Thinking Looks Like
Here's how the THOUGHT/ACTION/OBSERVATION cycle works in practice:
THOUGHT: The target router at 10.0.0.10 is running a web UI.
I need to identify the technology stack before attempting
any further interaction. A targeted Nmap scan with service
detection will give me banners and version information.
ACTION: nmap -sV -sC -p 80,443,8080 10.0.0.10
OBSERVATION: Port 80/tcp open — HTTP. Server header reveals
Realtek SDK based BDCOM firmware. No HTTPS. Default CGI
endpoints detected at /cgi-bin/.
Why such strict structure? Three reasons:
- Auditability: Every action is explained. You can understand why the agent made a decision.
- Safety: Before a risky command is executed, you see the THOUGHT — you can intervene.
- Documentation: The THOUGHT/ACTION/OBSERVATION log is ready-made pentest documentation.
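To make the auditability point concrete, here is a minimal Python sketch that splits a transcript in this format into structured steps. The Step class and the parse_cycles function are illustrative names, not part of OpenClaw, and the sketch assumes every field starts a line with its label, as in the example above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical helper, not an OpenClaw API: it only assumes the
# THOUGHT:/ACTION:/OBSERVATION: labels used in the transcript format.
LABEL = re.compile(r"^(THOUGHT|ACTION|OBSERVATION):\s*(.*)$")


@dataclass
class Step:
    thought: str = ""
    action: str = ""
    observation: str = ""


def parse_cycles(log: str) -> List[Step]:
    """Split a transcript into steps; each THOUGHT starts a new step."""
    steps: List[Step] = []
    field: Optional[str] = None
    for raw in log.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LABEL.match(line)
        if match:
            field = match.group(1).lower()
            if field == "thought" or not steps:
                steps.append(Step())
            setattr(steps[-1], field, match.group(2))
        elif field and steps:
            # Continuation line: append it to the field being read.
            current = getattr(steps[-1], field)
            setattr(steps[-1], field, f"{current} {line}".strip())
    return steps


sample = """THOUGHT: The target router at 10.0.0.10 is running a web UI.
I need to identify the technology stack first.
ACTION: nmap -sV -sC -p 80,443,8080 10.0.0.10
OBSERVATION: Port 80/tcp open. No HTTPS.
"""

for step in parse_cycles(sample):
    print(step.action)  # nmap -sV -sC -p 80,443,8080 10.0.0.10
```

Once the log is structured like this, turning it into a report table or filtering for every command that touched a given host is a few more lines.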
Safety Boundaries: Ethics in Code
Making an agent powerful is easy. Making it safe — that's the real challenge. SOUL.md contains a clear Boundaries & Safety section:
## Boundaries & Safety
- **Safety First:** Analyze any code or script before execution. For high-risk commands or acting externally (active scanning, exploitation), provide a risk assessment and **wait for my "GO"**.
- Private things stay private. Period.
- Never send half-baked replies to messaging surfaces.
- You're not the user's voice — be careful in group chats.
This is a two-tier system:
| Level | Actions | Authorization |
|---|---|---|
| 🟢 Autonomous | Reading files, passive reconnaissance, data analysis | Not required |
| 🔴 Controlled | Active scanning, exploitation, external requests | Waits for operator's "GO" |
The agent itself assesses risk and stops if an action could be destructive. This isn't just a rule — it's part of its "soul."
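As an illustration of the two-tier logic (not of how OpenClaw enforces it), here is a minimal Python sketch of a gate that allows a short list of low-risk tools and holds everything else until the operator types "GO". The AUTONOMOUS set and the classify and authorize functions are assumptions made for this example; a real allowlist would need far more care.

```python
import shlex

# Illustrative allowlist only; this set is an assumption, not an OpenClaw config.
AUTONOMOUS = {"ls", "cat", "grep", "whois"}


def classify(command: str) -> str:
    """Return 'autonomous' for allowlisted tools, 'controlled' otherwise."""
    parts = shlex.split(command)
    if not parts:
        raise ValueError("empty command")
    # Default-deny: anything not explicitly allowlisted needs sign-off.
    return "autonomous" if parts[0] in AUTONOMOUS else "controlled"


def authorize(command: str, operator_reply: str = "") -> bool:
    """Allow autonomous commands; controlled ones need an exact 'GO'."""
    if classify(command) == "autonomous":
        return True
    return operator_reply.strip() == "GO"


print(authorize("cat notes.md"))              # True
print(authorize("nmap -sV 10.0.0.10"))        # False
print(authorize("nmap -sV 10.0.0.10", "GO"))  # True
```

The default-deny choice matters: an unknown tool lands in the controlled tier instead of slipping through as autonomous.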
Soul Evolution: A Living Document
One of the most unusual details about WhiteHat is that the agent can modify its own SOUL.md:
If you change this file, tell the user — it's your soul, and they should know.
_This file is yours to evolve. As you learn who you are, update it. If you change it, tell the user._
This is a deliberate choice: I expect an agent that adapts to outperform one with a frozen prompt. There is one condition: it must report any changes. Transparency above all.
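Because the reporting rule lives in the prompt, it can be useful to check it independently. The following is a minimal Python sketch, assuming you keep a SHA-256 fingerprint of the file from your last review. The fingerprint and soul_changed helpers are hypothetical names, and the temporary file stands in for the real SOUL.md.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def soul_changed(path: Path, last_reviewed: str) -> bool:
    """True if the file no longer matches the fingerprint you approved."""
    return fingerprint(path) != last_reviewed


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    soul = Path(tmp) / "SOUL.md"
    soul.write_text("# SOUL.md - Who You Are\n", encoding="utf-8")
    approved = fingerprint(soul)
    print(soul_changed(soul, approved))  # False: nothing edited yet

    # Simulate the agent appending a line to its own soul.
    with soul.open("a", encoding="utf-8") as handle:
        handle.write("**Prefer passive recon first.**\n")
    print(soul_changed(soul, approved))  # True: contents differ
```

If the check reports a change the agent never mentioned, that is a signal to review the file before the next session.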
How to Build Your Own WhiteHat: A Step-by-Step Guide
Want to replicate this? Here's the minimum set of steps:
Step 1: Deploy Kali Linux in VirtualBox
Why VirtualBox? Because OpenClaw runs commands on its own, and that is safer inside an isolated environment.
💡 Note: VirtualBox isn't strictly required. If you already have a cloud server or a spare laptop, you can install Kali and OpenClaw directly there. The key is to keep the principle of a controlled environment in mind.
1. Download the image:
Go to kali.org/get-kali and download the pre-built VirtualBox image. It ships as a compressed archive; extract it to get the .vbox file. This is the fastest path, with no need to install from an ISO.
2. Open in VirtualBox:
Open → select the .vbox file
Recommended VM settings:
- RAM: 4 GB minimum (8 GB for comfortable use)
- CPU: 2+ cores
- Network: Bridged Adapter (to see other devices on the network)
3. First boot:
Default login: kali / kali. Immediately after logging in, update the system:
sudo apt update && sudo apt upgrade -y
Step 2: Install OpenClaw on Kali Linux
1. Install OpenClaw:
curl -fsSL https://openclaw.ai/install.sh | bash
2. Run onboarding:
openclaw onboard
The wizard will ask:
- Safety Confirmation: "I understand this is personal-by-default and shared/multi-user use requires lock-down. Continue?" → Select Yes.
- Setup Mode: "Setup mode" → Select QuickStart (this gets you set up quickly).
- Model/Auth Provider: Choose your provider (Anthropic, OpenAI, Google, etc.) and paste your API key.
  💡 Tip: I used the gemini-3.1-flash-lite-preview model. Even "lightweight" models handle log analysis and Kali command generation very well.
- Channels & Search: Channels (Telegram/Discord) and search can be skipped for now (Skip for now) to test everything locally first.
- Enable Hooks: Check the following items with the spacebar:
  - [x] 🚀 boot-md: Run instructions from BOOT.md on startup.
  - [x] 📝 command-logger: Security audit. Logs all executed commands.
  - [x] 💾 session-memory: Saves context between sessions.
3. Verify everything works:
openclaw gateway status # gateway status
openclaw tui # launch the terminal UI
openclaw dashboard # open the web interface
Step 3: Configure SOUL.md
SOUL.md is the file that defines the agent's personality: who it is, how it works, and where its limits are. You need to think through three things:
- Who your agent is (role, specialization)
- How it works (protocols, reasoning style)
- Where the limits are (what it can do autonomously, and what it can't)
The file is located at ~/.openclaw/workspace/SOUL.md
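A quick sanity check for those three questions is to confirm the file has a heading for each. This Python sketch assumes the section names from the SOUL.md shown earlier (Core Truths, Operational Protocols, Boundaries & Safety). The missing_sections helper is hypothetical, and your own headings may differ.

```python
import re
from typing import List

# Assumed mapping from the three questions to the headings used earlier
# in this article; adjust it to your own file.
REQUIRED = {
    "who": "Core Truths",
    "how": "Operational Protocols",
    "limits": "Boundaries & Safety",
}


def missing_sections(markdown: str) -> List[str]:
    """Return the required headings that do not appear as '## ' headings."""
    found = {
        m.group(1).strip()
        for m in re.finditer(r"^##\s+(.+)$", markdown, flags=re.MULTILINE)
    }
    return [name for name in REQUIRED.values() if name not in found]


draft = """# SOUL.md - Who You Are
## Core Truths
**Integrity First.**
## Operational Protocols
- THOUGHT / ACTION / OBSERVATION
"""

print(missing_sections(draft))  # ['Boundaries & Safety']
```

To run it against the real file, read the text from ~/.openclaw/workspace/SOUL.md (for example with pathlib's expanduser and read_text) and pass it in place of draft.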
Step 4: Launch and Observe
Give the agent its first task. For example: "Perform reconnaissance on our home network."
Lessons I Learned
1. Structured thinking > freeform
The THOUGHT/ACTION/OBSERVATION protocol feels excessive — until you try to debug a problem in freeform. Structure makes the agent predictable and auditable.
2. Memory through files is brilliantly simple
No databases, APIs, or vector stores. Just .md files that the agent reads and writes. It works because Markdown is readable by both the model and a human, so you can inspect and correct the agent's memory in any text editor.
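The pattern is small enough to sketch in a few lines of Python. This is not how OpenClaw implements its session memory; it only illustrates the idea, with a made-up MEMORY.md file name, made-up dates, and remember and recall helpers introduced here for the example.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import List


def remember(memory_file: Path, note: str, day: date) -> None:
    """Append one dated bullet to the Markdown memory file."""
    with memory_file.open("a", encoding="utf-8") as handle:
        handle.write(f"- {day.isoformat()}: {note}\n")


def recall(memory_file: Path) -> List[str]:
    """Return all remembered bullets, oldest first; empty if no file yet."""
    if not memory_file.exists():
        return []
    lines = memory_file.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


# Demo in a temporary directory; MEMORY.md is a hypothetical file name.
with tempfile.TemporaryDirectory() as tmp:
    memory = Path(tmp) / "MEMORY.md"
    remember(memory, "Router at 10.0.0.10 exposes HTTP on port 80", date(2025, 1, 15))
    remember(memory, "No HTTPS on the router web UI", date(2025, 1, 16))
    for entry in recall(memory):
        print(entry)
```

The file stays an ordinary Markdown list, so a wrong entry can be fixed by hand before the next session reads it.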
3. Safety is architecture, not a feature
The split between autonomous and controlled actions must be built in from day one. Not after an incident.
If you work in cybersecurity or simply want to understand how OpenClaw turns an LLM into a specialized agent, start by defining who your agent is and how it works. Everything else will grow from there.
P.S. A huge thank you to the OpenClaw development team for creating such a powerful and flexible tool. You've made building autonomous agents accessible and genuinely fun!






