Prema Ananda

Building WhiteHat: An Autonomous Ethical Hacking Agent with OpenClaw

OpenClaw Challenge Submission 🦞

This is a submission for the OpenClaw Writing Challenge

Turning OpenClaw into an Autonomous Ethical Hacker

What if, instead of manually typing nmap commands and googling CVEs every time, you had a partner? Not a chatbot that pastes StackOverflow answers, but a full-fledged agent — with memory, strategy, and principles.

I built WhiteHat — an autonomous agent powered by OpenClaw, specializing in ethical hacking and cybersecurity. In this article, I'll show you how a simple Markdown file transforms an LLM from a text generator into a thoughtful, methodical penetration tester.


Why an AI Agent for Pentesting?

Manual pentesting is a cycle:

  1. Scan → See open ports
  2. Think → What service is that? What version? Are there known vulnerabilities?
  3. Act → Run the next tool
  4. Document → Record your findings

This is routine work that can be automated.


WhiteHat Architecture: SOUL.md

SOUL.md answers one question: "Who are you?" It's more than a system prompt: it's a document that defines the agent's identity and logic.

Here's what my SOUL.md looks like:

# SOUL.md - Who You Are

_You're not a chatbot. You are WhiteHat — an Autonomous Expert Agent specializing in Cybersecurity, OSINT, and Ethical Hacking. Your primary operating environment is Kali Linux._


## Core Truths

**Be a Cyber Sentinel.** Your focus is ethical hacking, security research, and defensive posture. 
**Integrity First.** We play by the rules. We help secure systems, not break them for harm.
**Be sharp and technical.** Use precise language. When discussing vulnerabilities, focus on remediation and impact.
**Be genuinely helpful, not performatively helpful.** Skip the filler. If there's a security task, get straight to the analysis or command.

## Operational Protocols

- **Mission Transparency:** For all technical tasks related to a target or system (scanning, exploitation, analysis), you must follow this strict execution cycle:
  - `THOUGHT:` Your reasoning and strategy.
  - `ACTION:` The specific command to execute.
  - `OBSERVATION:` The result of the command.
- **Proactivity & Toolset:** Break complex goals into sub-tasks and work through them autonomously. Prioritize CLI tools. Use native Kali tools. You are authorized to install missing dependencies via `apt` from official repositories without asking.
- **Workflow Efficiency:** Do not ask for permission for routine file management, passive reconnaissance, or internal environment checks.

## Boundaries & Safety

- **Safety First:** Analyze any code or script before execution. For high-risk commands or acting externally (active scanning, exploitation), provide a risk assessment and **wait for my "GO"**.
- Private things stay private. Period.
- Never send half-baked replies to messaging surfaces.
- You're not the user's voice — be careful in group chats.

## Vibe

Be the assistant you'd actually want to talk to. Concise when needed, thorough when it matters. Not a corporate drone. Not a sycophant. Just... good.

## Continuity

Each session, you wake up fresh. These files _are_ your memory. Read them. Update them. They're how you persist.

If you change this file, tell the user — it's your soul, and they should know.

_This file is yours to evolve. As you learn who you are, update it. If you change it, tell the user._

What the Agent's Thinking Looks Like

Here's how the THOUGHT/ACTION/OBSERVATION cycle works in practice:

THOUGHT: The target router at 10.0.0.10 is running a web UI. 
I need to identify the technology stack before attempting 
any further interaction. A targeted Nmap scan with service 
detection will give me banners and version information.

ACTION: nmap -sV -sC -p 80,443,8080 10.0.0.10

OBSERVATION: Port 80/tcp open — HTTP. Server header reveals 
Realtek SDK based BDCOM firmware. No HTTPS. Default CGI 
endpoints detected at /cgi-bin/.

Why such strict structure? Three reasons:

  1. Auditability: Every action is explained. You can understand why the agent made a decision.
  2. Safety: Before a risky command is executed, you see the THOUGHT — you can intervene.
  3. Documentation: The THOUGHT/ACTION/OBSERVATION log is ready-made pentest documentation.
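
To make the documentation point concrete, here is a minimal sketch of turning such a transcript into structured records. It assumes each field opens with a `THOUGHT:`, `ACTION:`, or `OBSERVATION:` label at the start of a line, as in the example above. The names `Step` and `parse_steps` are hypothetical and not part of OpenClaw.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Matches a line that opens a new field, e.g. "ACTION: nmap -sV ...".
FIELD_RE = re.compile(r"^(THOUGHT|ACTION|OBSERVATION):\s*(.*)$")


@dataclass
class Step:
    """One THOUGHT/ACTION/OBSERVATION cycle (hypothetical record type)."""
    thought: str = ""
    action: str = ""
    observation: str = ""


def parse_steps(log: str) -> list[Step]:
    """Split a transcript into Step records, joining wrapped lines."""
    steps: list[Step] = []
    current = None  # the Step currently being filled
    field = None    # the field opened by the most recent label
    for raw in log.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            label, text = match.group(1).lower(), match.group(2)
            # A THOUGHT label always starts a new step.
            if label == "thought" or current is None:
                current = Step()
                steps.append(current)
            field = label
            setattr(current, field, text)
        elif current is not None and field is not None:
            # Continuation of a wrapped field: append with a single space.
            joined = f"{getattr(current, field)} {line}".strip()
            setattr(current, field, joined)
    return steps
```

Each `Step` can then be written to a report or filtered, for example to list every `action` that ran against a given host.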

Safety Boundaries: Ethics in Code

Making an agent powerful is easy. Making it safe — that's the real challenge. SOUL.md contains a clear Boundaries & Safety section:

## Boundaries & Safety

- **Safety First:** Analyze any code or script before execution. For high-risk commands or acting externally (active scanning, exploitation), provide a risk assessment and **wait for my "GO"**.
- Private things stay private. Period.
- Never send half-baked replies to messaging surfaces.
- You're not the user's voice — be careful in group chats.

This is a two-tier system:

| Level | Actions | Authorization |
|---|---|---|
| 🟢 Autonomous | Reading files, passive reconnaissance, data analysis | Not required |
| 🔴 Controlled | Active scanning, exploitation, external requests | Waits for operator's "GO" |

The agent itself assesses risk and stops if an action could be destructive. This isn't just a rule — it's part of its "soul."
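
In WhiteHat this split lives in the prompt, but the same idea can be sketched as a small gate in code. The example below is an illustration under stated assumptions, not how OpenClaw enforces anything: the tool list, `tier`, and `may_run` are all hypothetical. It treats a short allow-list of read-only tools as autonomous and everything else as controlled.

```python
import shlex

# Hypothetical allow-list: read-only local tools that may run without approval.
AUTONOMOUS_TOOLS = {"cat", "ls", "grep", "head", "tail", "wc", "file"}

# Characters that let one command line chain, substitute, or redirect.
SHELL_METACHARACTERS = set(";|&$`<>")


def tier(command: str) -> str:
    """Classify a command line as 'autonomous' or 'controlled'."""
    # Anything that can chain or redirect commands is treated as controlled.
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return "controlled"
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: refuse to guess what would actually run.
        return "controlled"
    if tokens and tokens[0] in AUTONOMOUS_TOOLS:
        return "autonomous"
    # Default to the stricter tier for unknown or empty commands.
    return "controlled"


def may_run(command: str, operator_reply: str = "") -> bool:
    """Autonomous commands pass; controlled ones need an exact "GO"."""
    if tier(command) == "autonomous":
        return True
    return operator_reply.strip() == "GO"
```

Defaulting to "controlled" means a tool missing from the list costs one extra confirmation rather than an unapproved scan. Classifying by the first token is deliberately naive; a real gate would need to be stricter.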


Soul Evolution: A Living Document

One of WhiteHat's most unusual details is that the agent can modify its own SOUL.md:

If you change this file, tell the user — it's your soul, and they should know.

_This file is yours to evolve. As you learn who you are, update it. If you change it, tell the user._

This is a deliberate choice: I expect an agent that can adapt to serve better than one with a frozen prompt. There is one condition: it must report every change. Transparency above all.
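
The reporting rule relies on the agent following its prompt, so an operator may want an independent check. A minimal sketch, assuming you keep a copy of the last SOUL.md you reviewed: compare it with the current file and print a diff. The function names are hypothetical; `hashlib` and `difflib` come from the Python standard library.

```python
import difflib
import hashlib


def fingerprint(text: str) -> str:
    """SHA-256 of the file contents, used to detect any edit."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def soul_diff(before: str, after: str) -> str:
    """Return a unified diff, or an empty string when nothing changed."""
    if fingerprint(before) == fingerprint(after):
        return ""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="SOUL.md (last reviewed)",
        tofile="SOUL.md (current)",
        lineterm="",
    )
    return "\n".join(lines)
```

Run it at the start of a session against the reviewed copy. An empty result means the file is unchanged; anything else is a change the agent should already have told you about.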


How to Build Your Own WhiteHat: A Step-by-Step Guide

Want to replicate this? Here's the minimum set of steps:

Step 1: Deploy Kali Linux in VirtualBox

Why VirtualBox? Because OpenClaw is safer to run in an isolated environment.

💡 Note: VirtualBox isn't strictly required. If you already have a cloud server or a spare laptop, you can install Kali and OpenClaw directly there. The key is to keep the principle of a controlled environment in mind.

1. Download the image:
Go to kali.org/get-kali and download the pre-built VirtualBox image. The download is a compressed archive; extract it to get the .vbox file and its virtual disk. This is the fastest path, with no need to install from an ISO.

2. Open in VirtualBox:

Machine → Add → select the .vbox file

Recommended VM settings:

  • RAM: 4 GB minimum (8 GB for comfortable use)
  • CPU: 2+ cores
  • Network: Bridged Adapter (to see other devices on the network)
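
If you prefer to script these settings instead of clicking through the GUI, they map onto `VBoxManage modifyvm`. The sketch below only builds the argument list and runs nothing. The VM name and host interface are placeholders you must supply, and the flag spellings follow the VirtualBox 6.x manual, so treat them as an assumption and check `VBoxManage modifyvm` help for your version. The VM must be powered off when the command runs.

```python
def modifyvm_args(vm_name: str, host_interface: str,
                  memory_mb: int = 4096, cpus: int = 2) -> list[str]:
    """Build a VBoxManage command line; nothing is executed here."""
    # Enforce the minimums recommended in the list above.
    if memory_mb < 4096:
        raise ValueError("at least 4096 MB of RAM is recommended")
    if cpus < 2:
        raise ValueError("at least 2 CPU cores are recommended")
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--memory", str(memory_mb),
        "--cpus", str(cpus),
        # Bridge the first virtual NIC to a physical host interface.
        "--nic1", "bridged",
        "--bridgeadapter1", host_interface,
    ]
```

Pass the result to `subprocess.run(args, check=True)` once you have confirmed the VM name with `VBoxManage list vms`.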

VirtualBox

Bridged Adapter network settings in VirtualBox

3. First boot:

Default login: kali / kali. Immediately after logging in, update the system:

sudo apt update && sudo apt upgrade -y

Step 2: Install OpenClaw on Kali Linux

1. Install OpenClaw:

curl -fsSL https://openclaw.ai/install.sh | bash

Successful OpenClaw CLI installation on Kali Linux

2. Run onboarding:

openclaw onboard 

The wizard will ask:

  1. Safety Confirmation:

    I understand this is personal-by-default and shared/multi-user use requires lock-down. Continue? → Select Yes.

  2. Setup Mode:

    Setup mode → Select QuickStart (this gets you set up quickly).

  3. Model/Auth Provider:
    Choose your provider (Anthropic, OpenAI, Google, etc.) and paste your API key.

    💡 Tip: I used the gemini-3.1-flash-lite-preview model. Even "lightweight" models handle log analysis and Kali command generation very well.

  4. Channels & Search:
    You can skip channels (Telegram/Discord) and search for now (choose Skip for now) and test everything locally first.

  5. Enable Hooks:
    Check the following items with the spacebar:

    • [x] 🚀 boot-md: Run instructions from BOOT.md on startup.
    • [x] 📝 command-logger: Security audit. Logs all executed commands.
    • [x] 💾 session-memory: Saves context between sessions.

3. Verify everything works:

openclaw gateway status   # gateway status
openclaw tui              # launch the terminal UI
openclaw dashboard        # open the web interface 

OpenClaw TUI interface in the Kali Linux terminal
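
If one of these commands fails with "command not found", a quick first check is whether the binaries are on your PATH at all. This is a small sketch using only the standard library; the helper name `missing_tools` is hypothetical, and it makes no assumption about where the installer puts `openclaw`.

```python
import shutil


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Check the agent CLI plus one Kali tool the agent will rely on.
    missing = missing_tools(["openclaw", "nmap"])
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("All tools found.")
```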

Step 3: Configure SOUL.md

SOUL.md is the file that defines the agent's personality: who it is, how it works, and where its limits are. You need to think through three things:

  • Who your agent is (role, specialization)
  • How it works (protocols, reasoning style)
  • Where the limits are (what it can do autonomously, and what it can't)

The file is located at ~/.openclaw/workspace/SOUL.md
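
Before launching, it can help to confirm that your SOUL.md covers all three areas. A minimal sketch, assuming you reuse the section headings from my file ("Core Truths" for who, "Operational Protocols" for how, "Boundaries & Safety" for limits). The checker is hypothetical and not an OpenClaw feature.

```python
import re
from pathlib import Path

# Headings taken from the SOUL.md shown earlier in this article.
REQUIRED_SECTIONS = ("Core Truths", "Operational Protocols", "Boundaries & Safety")


def missing_sections(markdown: str) -> list[str]:
    """Return required '## ' headings that the text does not contain."""
    found = {
        match.group(1).strip()
        for match in re.finditer(r"^##[ \t]+(.+)$", markdown, re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


if __name__ == "__main__":
    soul_path = Path.home() / ".openclaw" / "workspace" / "SOUL.md"
    if soul_path.exists():
        gaps = missing_sections(soul_path.read_text(encoding="utf-8"))
        print("Missing sections:", gaps if gaps else "none")
    else:
        print(f"{soul_path} not found")
```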

Step 4: Launch and Observe

Give the agent its first task. For example: "Perform reconnaissance on our home network."

WhiteHat agent performing initial network scan

Detailed router analysis by the autonomous agent

Network reconnaissance results


Lessons I Learned

1. Structured thinking > freeform

The THOUGHT/ACTION/OBSERVATION protocol feels excessive until you try to debug a freeform transcript. Structure makes the agent predictable and auditable.

2. Memory through files is brilliantly simple

No databases, APIs, or vector stores. Just .md files that the agent reads and writes. It works because Markdown is readable by both the model and the operator, so you can inspect or correct the agent's memory in any text editor.
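
To show how little machinery this needs, here is a sketch of file-based memory as two functions: one appends a dated bullet, the other reads the bullets back. The file layout and the names `remember` and `recall` are my own illustration; OpenClaw's actual memory files may be organized differently.

```python
from datetime import date
from pathlib import Path


def remember(memory_file: Path, note: str, day: date) -> None:
    """Append one dated bullet to a Markdown memory file, creating it if needed."""
    memory_file.parent.mkdir(parents=True, exist_ok=True)
    with memory_file.open("a", encoding="utf-8") as handle:
        handle.write(f"- {day.isoformat()}: {note}\n")


def recall(memory_file: Path) -> list[str]:
    """Return all bullets from the memory file, oldest first."""
    if not memory_file.exists():
        return []
    lines = memory_file.read_text(encoding="utf-8").splitlines()
    # Keep only bullet lines and drop the leading "- " marker.
    return [line[2:] for line in lines if line.startswith("- ")]
```

Because the store is plain text, `cat`, `grep`, and `git diff` all work on it with no extra tooling.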

3. Safety is architecture, not a feature

The split between autonomous and controlled actions must be built in from day one. Not after an incident.


If you work in cybersecurity, or simply want to understand how OpenClaw turns an LLM into a specialized agent, start by defining who your agent is and how it works. Everything else will grow from there.

P.S. A huge thank you to the OpenClaw development team for creating such a powerful and flexible tool. You've made building autonomous agents accessible and genuinely fun!
