DEV Community

Truong Phung
🦀 ZeroClaw Deep Dive 🤖 — A Build-It-Yourself Guide 📘

Source repository: https://github.com/zeroclaw-labs/zeroclaw (Rust, dual-licensed MIT OR Apache-2.0, Rust 2024 edition).

This is an opinionated, end-to-end walkthrough of how ZeroClaw is engineered, written so you can use it as a blueprint for building your own self-hosted, multi-channel, multi-provider AI agent runtime. It covers the philosophy, the architecture, the core abstractions, the security model, the agent loop, the SOP engine, the memory layer, operational concerns, and the build/governance discipline that holds the whole thing together.




🧩 1. What ZeroClaw is in one paragraph

ZeroClaw is a single Rust binary you configure and run. It connects to ~20 LLM providers (Anthropic, OpenAI, Ollama, Gemini, Bedrock, OpenRouter, and any OpenAI-compatible endpoint), reaches the world through 30+ messaging channels (Discord, Telegram, Matrix, email, voice, webhooks, CLI), and acts through tools (shell, browser, HTTP, hardware GPIO, custom MCP servers). Everything runs on the user's machine with their keys, in their workspace. There is no SaaS, no telemetry, no license server.

The slogan — "You own the agent. You own the data. You own the machine it runs on." — is the foundational constraint that every other design decision falls out of.


🧭 2. The four design opinions (the only ones that matter)

Internalize these before writing a line of code. They are the project's philosophy in priority order:

  1. ๐Ÿ  You own it. Local-first. Binary on your box, keys in your config, history in your DB. Pull the power cord and it stops cleanly. No cloud tenancy.
  2. ๐Ÿ›ก๏ธ Security-first, with escape hatches. Default autonomy is Supervised (medium-risk asks, high-risk blocks). Sandboxes, command policies, workspace boundaries, and cryptographic tool receipts are on by default. A loud, obviously-named YOLO preset lets dev users opt out.
  3. ๐Ÿชถ Minimal โ€” in binary size, dependencies, and surface area. A microkernel architecture with feature flags so users only ship what they use. Terse, localized tool descriptions; no hidden personality prompts.
  4. ๐Ÿ”Œ Provider-agnostic. The brain is pluggable. Switching providers is a one-line config change. Fallback chains keep things alive when an upstream flakes.

🚫 If your design conflicts with one of these, the design loses. Write them down at the top of your PHILOSOPHY.md before anything else.


๐Ÿ—๏ธ 3. The architecture in one picture

Trait-driven, layered, microkernel-shaped:

┌──────────────────────────────────────────────────────────────┐
│            channels       gateway        ACP                 │
│          (30+ adapters)   (REST/WS)    (JSON-RPC)            │
│                        ↓                                     │
│                   ZeroClaw runtime                           │
│         ┌──────────┬──────────┬──────────┐                   │
│         │  agent   │ security │   SOP    │                   │
│         │   loop   │  policy  │  engine  │                   │
│         └──────────┴──────────┴──────────┘                   │
│              ↓          ↓           ↓                        │
│          providers    tools      memory                      │
│         (Anthropic,  (shell,    (SQLite,                     │
│          OpenAI,     browser,    embeddings)                 │
│          Ollama,     HTTP,                                   │
│          ~20 more)   hardware)                               │
└──────────────────────────────────────────────────────────────┘

There are three structural layers:

Layer | Crates | Job
Edge | zeroclaw-channels, zeroclaw-providers, zeroclaw-gateway, zeroclaw-tools | Talk to the outside world (LLMs, chat platforms, HTTP, FS, devices)
Core | zeroclaw-runtime, zeroclaw-config, zeroclaw-api | Agent loop, policy, schema, the kernel ABI
Support | zeroclaw-memory, zeroclaw-infra, zeroclaw-macros, zeroclaw-tool-call-parser, zeroclaw-plugins, zeroclaw-hardware, zeroclaw-tui | Cross-cutting utilities (tracing, derive macros, parsing, plugins, hardware HAL)

The kernel rule. zeroclaw-runtime depends only on the trait crate zeroclaw-api, never on concrete provider/channel/tool crates. Everything is wired in via factory functions at startup. You can rip out Discord without recompiling the agent loop.
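As a sketch of that factory wiring: the runtime names only the trait, and one factory function names the concrete types. Everything below (the trait shape, the synchronous `chat`, the `build_provider` name) is an illustrative stand-in, not ZeroClaw's real signatures — the actual `Provider` trait is async.

```rust
// The runtime only knows this trait, never a concrete provider type.
trait Provider {
    fn name(&self) -> &'static str;
    fn chat(&self, prompt: &str) -> String; // simplified, sync stand-in
}

struct EchoProvider;

impl Provider for EchoProvider {
    fn name(&self) -> &'static str { "echo" }
    fn chat(&self, prompt: &str) -> String { format!("echo: {prompt}") }
}

// The factory is the ONLY place that names concrete types; ripping out a
// provider means deleting one match arm, not touching the agent loop.
fn build_provider(kind: &str) -> Option<Box<dyn Provider>> {
    match kind {
        "echo" => Some(Box::new(EchoProvider)),
        _ => None,
    }
}

fn main() {
    let p = build_provider("echo").expect("unknown provider");
    println!("{} -> {}", p.name(), p.chat("hi"));
}
```

The same shape repeats for channels, tools, and memory backends: trait in the API crate, concrete types behind a startup factory.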


🔌 4. The five core traits — the entire ABI

crates/zeroclaw-api/src/ defines the entire extension surface. Every other crate depends on this one. No implementations live here, only traits and shared data types. This is what keeps the kernel tiny and testable.

4.1 🤖 Provider — an LLM backend

#[async_trait]
pub trait Provider: Send + Sync {
    fn name(&self) -> &str;
    async fn chat(&self, req: ChatRequest<'_>) -> Result<ChatResponse>;
    async fn stream(&self, req: ChatRequest<'_>) -> Result<BoxStream<'_, StreamEvent>>;
    fn supports_tool_calls(&self) -> bool;
    fn supports_streaming(&self) -> bool;
    // ... capability flags for thinking, multimodal, JSON mode, etc.
}

Key support types: ChatMessage, ToolCall, ToolResultMessage, ConversationMessage, ChatResponse, TokenUsage, and StreamEvent (a stream of TextDelta, Thinking, ToolCallDelta, Done).

4.2 📡 Channel — a messaging platform

#[async_trait]
pub trait Channel: Send + Sync {
    fn name(&self) -> &str;
    async fn send(&self, msg: SendMessage) -> Result<()>;
    async fn poll(&self) -> Result<Vec<ChannelMessage>>;       // for pull-based
    async fn run(&self, tx: mpsc::Sender<ChannelMessage>) -> Result<()>;  // for push-based
    fn supports_draft_updates(&self) -> bool;
    async fn ask_approval(&self, req: ChannelApprovalRequest) -> Result<ChannelApprovalResponse>;
}

Approval lives on the channel: Telegram uses inline keyboard buttons, Slack uses Block Kit, Discord/Signal/WhatsApp embed a short token in the prompt and wait for <token> approve|deny|always. The Channel trait is the abstraction that makes that switch transparent to the runtime.
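For the token-reply channels, the parsing side is tiny. A sketch of what a Discord-style adapter might do with a reply (the enum and function names here are hypothetical; the real adapters live in zeroclaw-channels):

```rust
#[derive(Debug, PartialEq)]
enum ApprovalVerdict { Approve, Deny, Always }

// Parse a reply like "a1b2 approve" against the token we embedded in the
// approval prompt. Token comparison is exact; the verb is case-insensitive.
fn parse_approval(reply: &str, expected_token: &str) -> Option<ApprovalVerdict> {
    let mut parts = reply.split_whitespace();
    if parts.next()? != expected_token {
        return None; // wrong or missing token: ignore the message
    }
    match parts.next()?.to_ascii_lowercase().as_str() {
        "approve" => Some(ApprovalVerdict::Approve),
        "deny" => Some(ApprovalVerdict::Deny),
        "always" => Some(ApprovalVerdict::Always),
        _ => None,
    }
}

fn main() {
    assert_eq!(parse_approval("a1b2 Approve", "a1b2"), Some(ApprovalVerdict::Approve));
    println!("parsed");
}
```

Because the runtime only calls `ask_approval`, Telegram's buttons and this token scheme are interchangeable from its point of view.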

4.3 🛠️ Tool — an agent-callable capability

#[async_trait]
pub trait Tool: Send + Sync {
    fn name(&self) -> &str;
    fn description(&self) -> &str;
    fn parameters_schema(&self) -> serde_json::Value;
    async fn execute(&self, args: serde_json::Value) -> Result<ToolResult>;
    fn spec(&self) -> ToolSpec { /* default impl */ }
}

ToolResult { success: bool, output: String, error: Option<String> } — that's the whole contract. Everything from shell, web_fetch, pdf_read, git_operations, mcp_tool to gpio_write is a Tool.
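A sketch of implementing that contract, with a synchronous stand-in trait (the real Tool is `#[async_trait]`, takes `serde_json::Value` args, and returns a `Result`):

```rust
// Simplified stand-in for the ToolResult contract quoted above.
pub struct ToolResult {
    pub success: bool,
    pub output: String,
    pub error: Option<String>,
}

// Sync stand-in for the real async trait.
pub trait Tool {
    fn name(&self) -> &str;
    fn description(&self) -> &str;
    fn execute(&self, args: &str) -> ToolResult;
}

pub struct EchoTool;

impl Tool for EchoTool {
    fn name(&self) -> &str { "echo" }
    fn description(&self) -> &str { "Echo the arguments back" }
    fn execute(&self, args: &str) -> ToolResult {
        ToolResult { success: true, output: args.to_string(), error: None }
    }
}

fn main() {
    let tool = EchoTool;
    let res = tool.execute("hello");
    assert!(res.success);
    println!("{} -> {}", tool.name(), res.output);
}
```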

4.4 💾 Memory — a conversation/fact store

#[async_trait]
pub trait Memory: Send + Sync {
    fn name(&self) -> &str;
    async fn store(&self, key: &str, content: &str, category: MemoryCategory, ...) -> Result<()>;
    async fn search(&self, query: &str, ...) -> Result<Vec<MemoryEntry>>;
    async fn export(&self, filter: ExportFilter) -> Result<Vec<MemoryEntry>>;  // GDPR data portability
    // ...
}

MemoryCategory is Core | Daily | Conversation | Custom(String). Entries carry namespace, importance, superseded_by to support multi-agent isolation, prioritized retrieval, and conflict resolution.

4.5 ⚙️ RuntimeAdapter and Peripheral — execution-environment and hardware

RuntimeAdapter lets you port the agent to a new execution environment (native, Docker, Cloudflare Workers, embedded). It declares capability flags (has_shell_access, has_filesystem_access, supports_long_running, memory_budget) plus a single behavioural hook build_shell_command. The agent loop queries those flags and disables tools the runtime can't honour.

Peripheral lets a hardware board (STM32 over serial, RPi GPIO via gpiod) expose its capabilities as Tools. Connect → use → disconnect lifecycle, with a health_check probe.


🔄 5. The agent loop — what actually happens between "user sends message" and "agent replies"

crates/zeroclaw-runtime/src/agent/loop_.rs (~7,500 lines — yes, really, agent loops grow). Conceptually:

1. Channel adapter receives a platform-native event
   ├─ decode → canonical envelope
   ├─ dedup (ignore retries/restarts)
   └─ pair-check (allowed_users, allowed_chats, IP allowlist)

2. Runtime: deliver_message(envelope)
   ├─ Load conversation history + retrieved memory facts
   ├─ Apply tool filtering (built-in always; MCP gated by `tool_filter_groups`)
   └─ Compose system prompt + history + tool specs

3. Provider: chat(stream=true)
   ├─ Stream TextDelta chunks → relay to channel as draft updates (if supported)
   ├─ Stream Thinking chunks → never sent to model again, only to UI
   └─ Stream ToolCall

4. SecurityPolicy.validate_tool_call(name, args, risk)
   ├─ Allowed → run
   ├─ Blocked → return ToolResult::Err to model, model can react
   └─ Approval required → channel.ask_approval(prompt)
                          ├─ Telegram: inline keyboard
                          ├─ Slack: Block Kit
                          ├─ Discord/WhatsApp: token reply
                          ├─ CLI: inline prompt
                          └─ ACP: session/update kind="approval_request"

5. Tool.execute(args)
   ├─ Run inside OS-level sandbox (Landlock / Bubblewrap / Seatbelt / Docker)
   ├─ Compute receipt = HMAC-SHA256(session_key, name||args||result||ts)
   └─ Append receipt token to result text

6. Provider: chat(history + tool_result) → final TextDelta stream

7. Channel: send(final reply, threaded if applicable)

8. Memory: persist conversation + tool calls + receipts

Critical properties to copy:

  • Streaming is end-to-end. Don't buffer the whole LLM response. Channels that report supports_draft_updates() get incremental edits to a single sent message; others flush on stream end.
  • Tool calls are mid-stream. The model can emit a ToolCall while still generating text. Pause the stream, validate, invoke, resume.
  • Iteration cap. ZeroClaw caps tool-use iterations per user message at DEFAULT_MAX_TOOL_ITERATIONS = 10 to prevent runaway loops. Always have one.
  • Cost budgeting. A ToolLoopCostTrackingContext is threaded through (via tokio::task_local!) so a per-session budget can break the loop early.
  • Cancellation tokens everywhere. tokio_util::sync::CancellationToken is on the SendMessage and threaded through tool execution so you can interrupt mid-flight without leaving zombies.
  • Task-locals for cross-cutting state. TOOL_LOOP_THREAD_ID, TOOL_LOOP_SESSION_KEY, TOOL_CHOICE_OVERRIDE are tokio::task_local!s set by the agent loop, read by tools and providers without polluting function signatures.

🔒 6. The six-layer security model

This is the most important part to get right. ZeroClaw composes six independent layers, outer to inner — each is a defence in its own right; together they're defence-in-depth.

🔑 Layer 1: Channel pairing & access control

Done at the channel adapter, before the runtime ever sees the event:
allowed_users, allowed_chats, IP allowlists for webhooks, device pairing.

๐ŸŽš๏ธ Layer 2: Autonomy level (the coarse knob)

[autonomy]
level = "supervised"   # "read_only" | "supervised" | "full"
  • read_only โ€” observe only. file_read, memory_search, http GET, web_search. Public Q&A agents.
  • supervised (default) โ€” low-risk runs, medium asks, high blocks. Approval requests timeout at 300s and treat as denials.
  • full โ€” no approval gates, but workspace, sandbox, and command policies still enforce. For dev/CI/SOPs.

Per-tool overrides (these are the practical knob):

[autonomy.auto_approve]   tools = ["browser_open", "http"]   # always ok
[autonomy.always_ask]     tools = ["file_write", "shell"]    # always ask
[autonomy.never_allow]    tools = ["browser_automation"]     # always block

Per-channel overrides let a public Bluesky channel run read_only while a private CLI runs supervised.
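The precedence between the coarse level and the per-tool overrides can be sketched as a pure decision function. All names below are illustrative, and read_only is simplified to risk classes (in reality it's closer to a tool whitelist):

```rust
#[derive(Clone, Copy)]
enum Level { ReadOnly, Supervised, Full }

#[derive(Clone, Copy)]
enum Risk { Low, Medium, High }

#[derive(Debug, PartialEq)]
enum Decision { Allow, Ask, Block }

// Per-tool overrides win over the coarse level, mirroring
// [autonomy.never_allow] > [autonomy.always_ask] > [autonomy.auto_approve].
fn decide(level: Level, risk: Risk, tool: &str,
          auto_approve: &[&str], always_ask: &[&str], never_allow: &[&str]) -> Decision {
    if never_allow.contains(&tool) { return Decision::Block; }
    if always_ask.contains(&tool) { return Decision::Ask; }
    if auto_approve.contains(&tool) { return Decision::Allow; }
    match (level, risk) {
        (Level::Full, _) => Decision::Allow,
        (Level::ReadOnly, Risk::Low) => Decision::Allow,
        (Level::ReadOnly, _) => Decision::Block,
        (Level::Supervised, Risk::Low) => Decision::Allow,
        (Level::Supervised, Risk::Medium) => Decision::Ask,
        (Level::Supervised, Risk::High) => Decision::Block,
    }
}

fn main() {
    let d = decide(Level::Supervised, Risk::Medium, "shell", &[], &[], &[]);
    assert_eq!(d, Decision::Ask);
    println!("decision ok");
}
```

Note the ordering: never_allow beats full autonomy, which is exactly why per-tool blocks survive the YOLO preset's coarse setting only if you keep this precedence.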

๐Ÿ“ Layer 3: Workspace boundary & path rules

[autonomy]
workspace_only  = true
forbidden_paths = ["/etc", "/sys", "/boot", "~/.ssh", "~/.aws"]

forbidden_paths always blocks regardless of workspace_only. Symlink resolution happens before enforcement.

๐Ÿš Layer 4: Shell command policy

[autonomy]
allowed_commands   = ["git", "cargo", "grep", "find", "ls", "cat"]
forbidden_commands = ["shutdown", "reboot", "mkfs"]

A pattern-matching validator runs before the command hits the shell — it looks for dangerous flags, pipelines, and argument shapes (rm -rf /, :(){ :|: & };:, etc.). Blocks surface as a tool error the model can react to.
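A toy version of such a validator, using the two dangerous patterns the article names (the real one also inspects flags and pipelines, so treat this as a minimal sketch):

```rust
// Validate a shell command against deny patterns, a forbidden-command list,
// and an optional allowlist. Errors are phrased so the model can react.
fn validate_command(cmd: &str, allowed: &[&str], forbidden: &[&str]) -> Result<(), String> {
    // Dangerous argument shapes checked against the whole command line.
    for pat in ["rm -rf /", ":(){ :|: & };:"] {
        if cmd.contains(pat) {
            return Err(format!("Shell command blocked by policy: forbidden pattern '{pat}'"));
        }
    }
    let program = cmd.split_whitespace().next().unwrap_or("");
    if forbidden.contains(&program) {
        return Err(format!("Shell command blocked by policy: forbidden command '{program}'"));
    }
    // An empty allowlist means "no allowlist configured".
    if !allowed.is_empty() && !allowed.contains(&program) {
        return Err(format!("Shell command blocked by policy: '{program}' not in allowlist"));
    }
    Ok(())
}

fn main() {
    assert!(validate_command("git status", &["git", "cargo"], &[]).is_ok());
    println!("policy ok");
}
```

A real validator must also parse quoting and `;`/`&&`/`|` chains, since `ls; rm -rf /` would otherwise hide the second command behind an allowed first word.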

📦 Layer 5: OS-level sandbox (mechanism, not policy)

Auto-detected at startup:

Platform | Preferred order
Linux | Landlock (kernel 5.13+) → Bubblewrap → Firejail → Docker → none
macOS | Seatbelt (native) → Docker → none
Windows | AppContainer (experimental) → Docker → none

The sandbox confines filesystem (workspace + /usr + /lib read-only + explicit extras), network (allow-domains list optional), env (only shell_env_passthrough passes through; *_TOKEN/*_SECRET/*_PASSWORD patterns never), and process limits (CPU/memory/subprocesses/wall-time).
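The env rule is easy to sketch: only names on the passthrough list survive, and secret-shaped names never do, even if listed. The function name and exact suffix set are assumptions for illustration:

```rust
// Filter a process environment down to the sandbox-approved subset.
fn filter_env<'a>(env: &[(&'a str, &'a str)], passthrough: &[&str]) -> Vec<(&'a str, &'a str)> {
    const SECRET_SUFFIXES: [&str; 3] = ["_TOKEN", "_SECRET", "_PASSWORD"];
    env.iter()
        // only variables explicitly listed in shell_env_passthrough
        .filter(|(k, _)| passthrough.contains(k))
        // secret-shaped names are dropped unconditionally
        .filter(|(k, _)| !SECRET_SUFFIXES.iter().any(|s| k.ends_with(*s)))
        .copied()
        .collect()
}

fn main() {
    let env = [("PATH", "/usr/bin"), ("GITHUB_TOKEN", "x"), ("HOME", "/home/u")];
    let passed = filter_env(&env, &["PATH", "GITHUB_TOKEN"]);
    // GITHUB_TOKEN is listed but still dropped: the secret rule wins.
    assert_eq!(passed, vec![("PATH", "/usr/bin")]);
    println!("env ok");
}
```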

๐Ÿ” Layer 6: Tool receipts (cryptographic audit log)

This is the most novel part. Every tool invocation produces an HMAC-SHA256 digest:

receipt = HMAC-SHA256(ephemeral_session_key, tool_name || args || result || timestamp)
appended as: [receipt: zc-receipt-<timestamp>-<base64url-digest>]

The model sees receipts in its context but cannot forge them — it doesn't have the key. This closes the deniability gap: the model cannot claim it ran a tool it didn't, and cannot fabricate a tool result. Receipts are chained (each includes the hash of the previous), so tampering invalidates the rest of the log. Cost: <1 ms per call, no new external deps.

Cite the paper: Basu, A. (2026). "Tool Receipts, Not Zero-Knowledge Proofs." The point is that you don't need ZK proofs for an in-process agent — a symmetric MAC with an ephemeral per-session key is enough.
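A compressed sketch of the chained-receipt idea. Std's DefaultHasher stands in for HMAC-SHA256 purely so this runs without crates — it is NOT a MAC; real code should use the hmac and sha2 crates keyed with the ephemeral session key:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// receipt = H(session_key || prev || tool || args || result || ts)
// DefaultHasher is a deterministic stand-in here; substitute HMAC-SHA256.
fn receipt(session_key: u64, prev: u64, tool: &str, args: &str, result: &str, ts: u64) -> u64 {
    let mut h = DefaultHasher::new();
    session_key.hash(&mut h); // the key the model never sees
    prev.hash(&mut h);        // chaining: tampering invalidates everything after
    tool.hash(&mut h);
    args.hash(&mut h);
    result.hash(&mut h);
    ts.hash(&mut h);
    h.finish()
}

fn main() {
    let key = 0xdead_beef_u64;
    let r1 = receipt(key, 0, "shell", "ls", "ok", 1);
    let r2 = receipt(key, r1, "shell", "pwd", "/", 2);
    // Same inputs + same key -> verifiable; wrong key -> unforgeable.
    assert_eq!(receipt(key, r1, "shell", "pwd", "/", 2), r2);
    assert_ne!(receipt(key + 1, r1, "shell", "pwd", "/", 2), r2);
    println!("chain verified");
}
```

Verification replays the chain from the first receipt; any edited entry changes every digest downstream of it.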

🚧 Additional gates (worth copying)

  • OTP gating โ€” [security.otp] gated_actions = ["shell", "browser"] requires a TOTP before each listed action. Useful for remote admin.
  • Emergency stop โ€” zeroclaw estop halts all in-flight tools. Resuming requires OTP if [security.estop] enabled = true.
  • Prompt-injection guard โ€” scans model output for known injection patterns before validation.
  • Leak detector โ€” scans outbound messages for secret-shaped tokens (API keys, private keys) and blocks. The detector is configured to pass through zc-receipt-* tokens.
  • Pairing guard โ€” device pairing for channel auth; prevents stolen creds from being used on a new device.

โ“ The "what about blocked calls?" question

A blocked call is not silent. The validator returns an error → the runtime wraps it as ToolResult::Err → the model sees Error: Shell command blocked by policy: forbidden pattern 'rm -rf /' and can apologise/retry/escalate. If a channel restricts the toolset (tools_allow = [...]), the tool is simply not advertised to the model in the first place — it never sees a tool it can't call.


🤖 7. Provider strategy — supporting 20+ LLMs without exploding

Look at crates/zeroclaw-providers/src/. The trick:

anthropic.rs         openai.rs         ollama.rs
gemini.rs            bedrock.rs        azure_openai.rs
copilot.rs           glm.rs            kilocli.rs
                  ↓
        compatible.rs  ← single OpenAI-compatible impl
                  ↓
   Groq, Mistral, xAI, DeepSeek, Together, Fireworks,
   Perplexity, Cohere, Moonshot, Venice, OpenRouter, ...

One compatible.rs adapter implements the OpenAI Chat Completions schema and is reused for ~20 providers — most "new providers" are a config entry, not a new file. Only providers with genuinely different protocols (Anthropic Messages API, Gemini, Bedrock SigV4) get their own file.

Wrappers on top of providers:

  • router.rs โ€” multi-provider router that routes by task hint (e.g. reasoning โ†’ DeepSeek-R1, chat โ†’ Sonnet, vision โ†’ Gemini).
  • reliable.rs โ€” fallback chain wrapper. Failover triggers: HTTP 5xx, rate-limit, timeout, schema-validation failure.
  • schema.rs โ€” JSON-schema cleaner that normalizes tool schemas per provider. Gemini rejects minLength, pattern, $ref, additionalProperties; Anthropic doesn't resolve $ref; OpenAI is most permissive. The cleaner has named strategies (Gemini, Anthropic, OpenAI, Conservative) and gives you cross-provider tool portability for free.
  • Tool-call parser (zeroclaw-tool-call-parser) โ€” normalizes the model side. OpenAI-style tool_calls JSON, Anthropic <tool_use> blocks, Qwen XML, Ollama function-call formats โ€” all parsed into a single ParsedToolCall shape. Supports a dispatcher trait pattern (XmlToolDispatcher, JsonToolDispatcher) that strips <think>...</think> reasoning blocks before parsing.

Reasoning preservation. Some providers (DeepSeek-R1, GLM-4.7, Kimi K2.5) reject tool-call history that omits the reasoning_content field. ZeroClaw stores it as opaque pass-through on ChatResponse and ConversationMessage::AssistantToolCalls — you don't decode it, you just round-trip it.


💾 8. Memory — durable, queryable, GDPR-compliant

crates/zeroclaw-memory/:

  • Default backend: SQLite. Single-file, embedded, zero-ops.
  • Optional: PostgreSQL behind --features memory-postgres for multi-instance deployments needing shared concurrent writes (with pgvector for vector search).
  • Embeddings are pluggable (OpenAI, Ollama local, etc.).
  • Categories: Core (long-term facts/preferences), Daily (session logs), Conversation (raw history), Custom(name) for user-defined buckets.
  • Memory consolidation runs on a schedule โ€” summarizes long conversations, extracts facts, marks superseded entries.
  • GDPR Art. 20 export is built into the trait via ExportFilter. Namespace, session, category, time-range โ€” all filterable.

The memory layer is where you implement per-namespace isolation so multi-agent, multi-user deployments don't cross-contaminate.


โš™๏ธ 9. The SOP engine โ€” automation that isn't just chat

Standard Operating Procedures live in crates/zeroclaw-runtime/src/sop/. The model's contribution is loose but bounded:

  • Trigger sources: cron, MQTT, webhook, peripheral events, manual.
  • Execution modes:
    • Auto โ€” run all steps, no human.
    • Supervised (default) โ€” approval before starting.
    • StepByStep โ€” approval before each step.
    • PriorityBased โ€” Critical/High โ†’ Auto, Normal/Low โ†’ Supervised.
    • Deterministic โ€” sequential, no LLM round-trips, step outputs piped as inputs to next step (with optional checkpoint approvals). This is the killer feature: when the model has already figured out a workflow, replay it deterministically and save 100% of the LLM cost.
  • Concurrency: per-SOP concurrency limits + cooldown.
  • Resumability: runs are durable; restart picks up where it left off.

If you build something similar, design the SOP type so the same definition can be run in any mode — that's how Deterministic mode "graduates" workflows from LLM-driven to mechanical.
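Deterministic mode reduces to a fold over steps, each step's output piped into the next. A toy sketch — `Step` and `run_deterministic` are hypothetical names, and real steps would be tool invocations rather than closures:

```rust
// A deterministic SOP step: output of one becomes input of the next.
type Step = fn(&str) -> Result<String, String>;

fn run_deterministic(steps: &[Step], input: &str) -> Result<String, String> {
    let mut data = input.to_string();
    for (i, step) in steps.iter().enumerate() {
        // An optional checkpoint approval would gate each iteration here.
        data = step(&data).map_err(|e| format!("step {i} failed: {e}"))?;
    }
    Ok(data)
}

fn main() {
    let steps: Vec<Step> = vec![
        |s| Ok(s.trim().to_string()),
        |s| Ok(s.to_uppercase()),
    ];
    assert_eq!(run_deterministic(&steps, "  deploy  "), Ok("DEPLOY".to_string()));
    println!("sop ok");
}
```

Because failure stops the pipeline with the failing step's index, a durable runner can persist `i` and resume exactly there after a restart.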


📡 10. Channels — one orchestrator, 30+ adapters

crates/zeroclaw-channels/src/orchestrator/ is the part that's worth reading. The adapters themselves (Discord, Telegram, Matrix, Slack, Signal, IRC, Bluesky, Nostr, WhatsApp, WeChat, Lark, DingTalk, LINE, IMessage, Email, Voice…) are mostly mechanical platform wiring. The orchestrator handles the cross-cutting concerns that every channel needs:

  • Inbound dedup โ€” same message arriving twice from a retry.
  • Draft updates โ€” single sent-message edits as the LLM streams.
  • Multi-message splits โ€” long replies into sequential messages.
  • Threading โ€” thread_ts (Slack) / thread ID (Discord); interruption_scope_id (distinct from thread_ts!) for cancellation grouping.
  • Media pipeline โ€” MediaAttachments on ChannelMessage flow into transcription (audio) / vision (images) / extraction (PDFs) before the message reaches the agent loop.
  • Approval routing โ€” channel-appropriate UX (buttons vs token replies vs ACP RPC).
  • Link enrichment โ€” fetches OpenGraph metadata for URLs in inbound messages.

Build the orchestrator first, then add channels. It's the part that pays for itself with every new adapter.
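Multi-message splitting is representative of what the orchestrator owns. A byte-based sketch (real code must respect UTF-8 char boundaries and each platform's own limit, e.g. Discord's 2000 characters; the function name is hypothetical):

```rust
// Split a long reply into chunks of at most `limit` bytes, preferring to
// break at line boundaries; a single oversized line is hard-split.
// NOTE: byte-based for brevity -- assumes ASCII-safe split points.
fn split_message(text: &str, limit: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for line in text.split_inclusive('\n') {
        // Flush the current chunk if this line won't fit.
        if !current.is_empty() && current.len() + line.len() > limit {
            chunks.push(std::mem::take(&mut current));
        }
        let mut line = line;
        while line.len() > limit {
            let (head, tail) = line.split_at(limit);
            chunks.push(head.to_string());
            line = tail;
        }
        current.push_str(line);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let chunks = split_message("aaa\nbbb\nccc\n", 8);
    assert_eq!(chunks, vec!["aaa\nbbb\n".to_string(), "ccc\n".to_string()]);
    println!("{} chunks", chunks.len());
}
```

Put logic like this in the orchestrator once, and every one of the 30+ adapters inherits it.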


๐Ÿ–ฅ๏ธ 11. Operational surfaces

ZeroClaw exposes the runtime through five distinct surfaces, in increasing intimacy:

  1. CLI (zeroclaw agent, zeroclaw onboard, zeroclaw service install, zeroclaw estop) — a clap-derived command tree, with reference docs generated from the derives via cargo mdbook refs. The docs cannot drift from the binary.
  2. HTTP/WebSocket gateway (zeroclaw-gateway) — REST for sessions/memory/status/cron, WS for streaming. Default-binds to 127.0.0.1; public bind requires an explicit [gateway] allow_public_bind = true. Pairing required by default.
  3. Web dashboard — chat, memory browser, config editor, cron management, tool inspection, served by the gateway.
  4. ACP (Agent Client Protocol) — JSON-RPC 2.0 over stdio, for IDE/editor integration. session/update events with kind = "approval_request" are how IDEs render approvals.
  5. Service registration — zeroclaw service install writes a unit file for systemd, launchctl, or Windows Service. The service manages its own logs, restarts, and (on Linux) sandbox negotiation.

Plus, for distribution:

  • Tauri desktop app (apps/tauri/) โ€” wraps the runtime + dashboard in a native shell.
  • Docker Compose (docker-compose.yml + multiple Dockerfile.*).
  • Kubernetes (deploy-k8s/).
  • Firmware (firmware/) โ€” stubs for STM32/RPi peripheral boards.
  • install.sh โ€” single-script bootstrap that asks "prebuilt vs source?", handles --minimal (~6.6 MB kernel-only), feature-flag selection, and ends with zeroclaw onboard.

๐Ÿ“ 12. Configuration โ€” one TOML, every key documented

A single file at ~/.zeroclaw/config.toml. Layout:

[providers.models.default]
provider = "openrouter"
model    = "anthropic/claude-sonnet-4-6"
api_key  = "${OPENROUTER_API_KEY}"

[providers.fallback]
chain = ["default", "ollama-local"]

[autonomy]
level             = "supervised"
workspace_only    = true
allowed_commands  = ["git", "cargo", "grep"]
forbidden_paths   = ["/etc", "~/.ssh"]

[security.sandbox]
backend = "auto"          # landlock | bubblewrap | firejail | docker | seatbelt | noop
network = "allowed-domains"
allowed_domains = ["api.openai.com", "api.anthropic.com"]

[agent.tool_receipts]
enabled              = true
show_in_response     = false
inject_system_prompt = true

[channels.discord]
token            = "${DISCORD_TOKEN}"
allowed_users    = ["123456789"]
autonomy_level   = "supervised"   # per-channel override

[gateway]
bind             = "127.0.0.1:8080"
allow_public_bind = false

[memory]
backend  = "sqlite"
path     = "${ZEROCLAW_HOME}/memory.db"
embedder = "ollama:nomic-embed-text"

Two principles to copy verbatim:

  • Configurable derive macro. crates/zeroclaw-macros/ exposes a derive that generates schema entries from struct definitions. The full config reference (docs/book/src/reference/config.md) is generated from the live schema. Docs cannot drift from code.
  • Secrets: never in-line, always ${ENV_VAR} interpolation, with an encrypted local secrets store (ChaCha20-Poly1305) for things you don't want in the environment.

๐Ÿท๏ธ 13. Stability tiers โ€” how a young project ships breaking changes safely

Every workspace crate carries a stability tier (per RFC #5574 — Microkernel transition):

Tier | Meaning
✅ Stable | Covered by the breaking-change policy; breaking changes require a major-version bump.
🔶 Beta | Breaking changes allowed in minor releases, with changelog notes.
🔬 Experimental | No stability guarantee; iterate freely.

Tiers are promoted, never demoted, by deliberate team decision. Today's snapshot:

Crate | Tier | Stable target
zeroclaw-api | Experimental | v1.0.0
zeroclaw-config | Beta | v0.8.0
zeroclaw-providers | Beta | —
zeroclaw-memory | Beta | —
zeroclaw-tool-call-parser | Beta | v0.8.0
zeroclaw-channels | Experimental | v1.0.0 (plugin migration)
zeroclaw-tools | Experimental | v1.0.0 (plugin migration)
zeroclaw-runtime | Experimental | —

Copy this idea. It's how you ship publicly, get users, and still iterate.


โš–๏ธ 14. Risk tiers โ€” how PRs are reviewed

The contribution doc carves changes into three risk classes. Use them for your review checklist:

  • ๐ŸŸข Low: docs / chore / tests-only.
  • ๐ŸŸก Medium: most crates/*/src/** behaviour changes without boundary or security impact.
  • ๐Ÿ”ด High: anything in runtime/security/, gateway/, tools/, .github/workflows/, access-control boundaries.

When uncertain, classify as higher risk. CI gating, required reviewers, and approval thresholds are tied to the tier.


๐Ÿ“ 15. Code conventions โ€” non-obvious rules that protect quality

Lifted from AGENTS.md. These are the rules that matter:

  1. 🔌 Trait-driven extension. New provider/channel/tool = implement a trait, register in the factory. Never patch the kernel.
  2. 🚫 No speculative config. Don't add a config key without a concrete use case landing in the same PR.
  3. 🌍 Localize user-facing output. Every CLI message, tool description, and onboarding prompt uses fl!() / Fluent strings. Bare string literals in user-visible paths are a CI failure.
  4. 🇬🇧 English log/panic messages. tracing:: events and panics stay in English with stable error_key fields. Translation breaks log aggregation.
  5. ❌ No unwrap() / expect() in production paths. Propagate the error or document the invariant that makes panic impossible (with a comment).
  6. 🗑️ No dead-code suppression. Don't #[allow(dead_code)] or _var to silence the compiler. Delete the code, wire it into behaviour, or open a follow-up issue. Reserve underscore names for required-but-unused trait/callback parameters only.
  7. 🎯 One concern per PR. Mixed feature+refactor+infra patches are rejected. Stacked PRs declare Depends on #...; replacements declare Supersedes #....
  8. 📝 Conventional commits + size labels (size: XS/S/M). Small PRs, always.
  9. 🔐 Privacy. Never commit personal data, real identities, or production secrets — even in examples or test fixtures. There's a "neutral-placeholder palette" in docs/.../privacy.md.
  10. 🤝 AI-coding co-authorship is welcome but governed by RFC #5615 — disclose the assistant, the human reviews, the human signs the commit.

🔧 16. The toolchain & tooling stack

Concretely, this is the supporting cast you'll want:

Concern | Tool
Async runtime | tokio
HTTP | reqwest
WebSocket | tokio-tungstenite
Serialization | serde, serde_json, toml
TLS | rustls
Crypto | chacha20-poly1305 (secrets), ring/hmac+sha2 (receipts)
SQLite | rusqlite
CLI parsing | clap (derive)
Terminal UI | ratatui, dialoguer
Logging/metrics | tracing + (optional) Prometheus + OpenTelemetry
Localization | fluent
Build profile | LTO + single codegen unit for release; faster CI variant
Lint/format | cargo fmt, cargo clippy -D warnings
Lockfile on CI | cargo build --locked, cargo test --locked
Pre-PR runner | just ci (alias for fmt-check + lint + test)
Docs site | mdbook with custom cargo mdbook refs/sync plugins
Fuzzing | cargo fuzz
Benchmarks | criterion (under benches/)
Cargo policy | cargo-deny (deny.toml) — license, advisory, banned-deps audit
Release automation | release-plz (release-plz.toml)
Reproducible env | flake.nix (Nix), .envrc (direnv), .actrc (act)
Editor integration | .vscode/, .gemini/, .claude/skills/ for agents

📚 17. Generating reference docs from code (so they never drift)

A pattern worth stealing wholesale:

  • docs/book/src/reference/cli.md is generated from the clap derives (cargo mdbook refs).
  • docs/book/src/reference/config.md is generated from the JSON schema produced by the Configurable derive macro.
  • The rustdoc-as-website page (api.md) is rebuilt from cargo doc --no-deps.
  • Translations in the docs book are seeded by AI fill (cargo mdbook sync calls Anthropic when ANTHROPIC_API_KEY is set), then human-reviewed via .po files.

Net effect: no handwritten reference page can describe a feature that doesn't exist. Wire your docs build into CI from day one.


๐Ÿ›๏ธ 18. Governance โ€” how decisions get made

  • RFC process for substantive changes (see docs/book/src/contributing/rfcs.md). Foundational ratified RFCs (#5574 microkernel, #5576 docs, #5577 governance, #5579 CI, #5615 AI co-authorship, #5653 zero-compromise error handling) read like a constitution.
  • Two-thirds majority for core-team votes (RFC #5577).
  • Public transparency documents under docs/book/src/foundations/ are protected files โ€” agent skills know not to move them.
  • Trademark policy carved out: code is MIT-or-Apache; the name and logo are trademarks of ZeroClaw Labs. Common pattern for projects that want OSS code without name impersonation.

โš ๏ธ 19. Anti-patterns to avoid (a checklist)

Straight from AGENTS.md, these are the mistakes you don't have to discover yourself:

  • โŒ Adding heavy dependencies for minor convenience.
  • โŒ Silently weakening security policy or access constraints.
  • โŒ Adding speculative config/feature flags "just in case".
  • โŒ Mixing massive formatting-only changes with functional changes.
  • โŒ Modifying unrelated modules "while here".
  • โŒ Bypassing failing checks without explicit explanation.
  • โŒ Hiding behaviour-changing side effects in refactor commits.
  • โŒ Suppressing unused production code with _ prefixes or #[allow(dead_code)].
  • โŒ Leaving unwrap() / expect() in production paths.
  • โŒ Including personal/identifying info in test data, examples, or commits.
  • โŒ Calling concrete provider/channel/tool types from the kernel โ€” go through the trait.

๐Ÿ—บ๏ธ 20. A practical roadmap for building your own ZeroClaw-shaped system

If you're starting fresh, do it in this order. Don't skip steps; each one reveals constraints for the next.

🌱 Phase 0 — Foundations (week 0)

  1. Pick a language with strong async, FFI, and a trait/interface system. (Rust is what ZeroClaw uses; Go works; TypeScript/Node works for v0 but won't reach the same binary-size/perf bar.)
  2. Write your PHILOSOPHY.md — three or four opinions, prioritized.
  3. Set up the workspace: cargo workspace (or equivalent monorepo), with one traits crate (*-api) and one runtime crate.
  4. Wire CI from commit #1: format + lint + test + lockfile + license/advisory audit.

🔌 Phase 1 — The kernel ABI (week 1)

  1. Define Provider, Channel, Tool, Memory, RuntimeAdapter traits in the API crate. Resist adding implementations here.
  2. Write the data types: ChatMessage, ToolCall, ToolResult, ChannelMessage, SendMessage, MemoryEntry, StreamEvent. Don't be afraid to evolve them — they're the bedrock.
  3. Stub a noop impl for each trait so the runtime can compile end-to-end.

๐Ÿ• Phase 2 โ€” One vertical slice (weeks 2โ€“3)

  1. One provider (start with OpenAI-compatible — it's the lingua franca; you'll get 20+ providers later for free).
  2. One channel (start with CLI — no auth, no webhooks, instant feedback).
  3. One tool (echo, then web_fetch).
  4. One memory backend (SQLite).
  5. Wire the agent loop end-to-end: receive → load history → call provider → stream → tool call → invoke → feed back → reply → persist.
  6. Streaming is hard. Get it right early.
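
The loop in step 5 can be sketched with a scripted stand-in provider and an echo tool. Everything here is hypothetical shorthand; real providers stream over HTTP and the loop carries far more state:

```rust
// Toy vertical slice: scripted provider + echo tool + the feed-back loop.
// All names here are hypothetical; real providers stream over HTTP.

use std::cell::RefCell;

enum Turn {
    ToolCall { name: String, args: String },
    Final(String),
}

struct ScriptedProvider {
    turns: RefCell<Vec<Turn>>,
}

impl ScriptedProvider {
    fn chat(&self, _history: &[String]) -> Turn {
        self.turns.borrow_mut().remove(0)
    }
}

fn echo_tool(args: &str) -> String {
    format!("echo: {args}")
}

fn agent_loop(provider: &ScriptedProvider, user_msg: &str) -> String {
    let mut history = vec![format!("user: {user_msg}")];
    // Iteration cap: never trust the model to terminate on its own.
    for _ in 0..8 {
        match provider.chat(&history) {
            Turn::ToolCall { name, args } => {
                let result = match name.as_str() {
                    "echo" => echo_tool(&args),
                    other => format!("unknown tool: {other}"),
                };
                // Feed the tool result back so the next model turn can use it.
                history.push(format!("tool[{name}]: {result}"));
            }
            Turn::Final(text) => return text,
        }
    }
    "error: iteration cap reached".into()
}

fn main() {
    let provider = ScriptedProvider {
        turns: RefCell::new(vec![
            Turn::ToolCall { name: "echo".into(), args: "hi".into() },
            Turn::Final("done".into()),
        ]),
    };
    assert_eq!(agent_loop(&provider, "hello"), "done");
    println!("vertical slice ran: receive -> provider -> tool -> reply");
}
```

Note the hard iteration cap even in the toy version; it becomes non-negotiable once real tool calls cost money.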

🔒 Phase 3 — Security model (week 4)

  1. Implement autonomy levels (ReadOnly/Supervised/Full) and the per-tool overrides.
  2. Add the workspace boundary check and forbidden-paths list.
  3. Add the shell command policy with allow/deny lists and the pattern validator.
  4. Wire the channel approval prompt — even if it's just CLI's inline prompt, the abstraction has to work for future channels.
  5. Implement tool receipts with HMAC-SHA256 and an ephemeral session key. This is cheap and changes the agent's accountability profile.
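
A toy version of the autonomy gate from step 1 might look like this. The three-level model and the risk buckets are illustrative; ZeroClaw's real policy engine layers in command policies and path checks:

```rust
// Illustrative autonomy gate, assuming a simple three-level model.
// A real policy engine also applies command policies and workspace checks.

#[derive(Debug, Clone, Copy)]
enum Autonomy { ReadOnly, Supervised, Full }

#[derive(Debug, Clone, Copy)]
enum Risk { Low, Medium, High }

#[derive(Debug, PartialEq)]
enum Decision { Allow, AskUser, Deny }

fn gate(session: Autonomy, risk: Risk, per_tool: Option<Autonomy>) -> Decision {
    // A per-tool override wins over the session-wide level.
    let level = per_tool.unwrap_or(session);
    match (level, risk) {
        (Autonomy::Full, _) => Decision::Allow,
        (Autonomy::ReadOnly, Risk::Low) => Decision::Allow,
        (Autonomy::ReadOnly, _) => Decision::Deny,
        (Autonomy::Supervised, Risk::Low) => Decision::Allow,
        (Autonomy::Supervised, Risk::Medium) => Decision::AskUser, // medium asks
        (Autonomy::Supervised, Risk::High) => Decision::Deny,      // high blocks
    }
}

fn main() {
    assert_eq!(gate(Autonomy::Supervised, Risk::Medium, None), Decision::AskUser);
    assert_eq!(gate(Autonomy::Supervised, Risk::High, None), Decision::Deny);
    assert_eq!(gate(Autonomy::ReadOnly, Risk::Medium, Some(Autonomy::Full)), Decision::Allow);
    println!("supervised default: medium asks, high blocks");
}
```

The useful property is that the gate is a pure function of (level, risk, override); that makes the policy trivially testable before any channel UI exists.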

📦 Phase 4 — Sandbox (week 5)

  1. Auto-detect Landlock on Linux, Seatbelt on macOS, Docker as universal fallback.
  2. Make the sandbox an opt-out (not an opt-in). Default-on, noop for YOLO mode.
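
The selection logic can be as small as this sketch. Real detection has to probe actual kernel support (for example the available Landlock ABI version); this only shows the default-on, opt-out shape:

```rust
// Sketch of sandbox backend selection by OS, with the sandbox ON by default.
// Real detection must probe kernel support; this only shows the opt-out shape.

#[derive(Debug, PartialEq)]
enum Sandbox { Landlock, Seatbelt, Docker, Disabled }

fn select_sandbox(os: &str, yolo_opt_out: bool) -> Sandbox {
    if yolo_opt_out {
        return Sandbox::Disabled; // the loud, explicit escape hatch
    }
    match os {
        "linux" => Sandbox::Landlock,
        "macos" => Sandbox::Seatbelt,
        _ => Sandbox::Docker, // universal fallback
    }
}

fn main() {
    assert_eq!(select_sandbox("linux", false), Sandbox::Landlock);
    assert_eq!(select_sandbox("windows", false), Sandbox::Docker);
    assert_eq!(select_sandbox("linux", true), Sandbox::Disabled);
    println!("backend on this host: {:?}", select_sandbox(std::env::consts::OS, false));
}
```

The design point is in the signature: disabling is a parameter the user must pass, not the absence of configuration.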

📈 Phase 5 — Scale the edges (weeks 6–10)

  1. Add a second OpenAI-compatible provider (confirm your trait is right).
  2. Add Anthropic (because its protocol is genuinely different — confirm again).
  3. Add the schema cleaner when Gemini support arrives (it'll fail without it).
  4. Add fallback chain + router.
  5. Add channels in this order: webhook → Discord → Telegram → Matrix → email → voice. Each one stresses the orchestrator differently.
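
The fallback chain in step 4 reduces to "try providers in order, return the first success". A hedged sketch, with plain closures standing in for real provider clients:

```rust
// Sketch of a fallback chain: try providers in order, return the first success.
// Illustrative only; a real router adds retries, health checks, routing rules.

fn chat_with_fallback(
    chain: &[&dyn Fn(&str) -> Result<String, String>],
    prompt: &str,
) -> Result<String, String> {
    let mut last_err = String::from("no providers configured");
    for provider in chain {
        match provider(prompt) {
            Ok(reply) => return Ok(reply),
            Err(e) => last_err = e, // remember why, then fall through to the next
        }
    }
    Err(last_err)
}

fn main() {
    let flaky = |_: &str| -> Result<String, String> { Err("rate limited".into()) };
    let stable = |p: &str| -> Result<String, String> { Ok(format!("ok: {p}")) };
    let chain: Vec<&dyn Fn(&str) -> Result<String, String>> = vec![&flaky, &stable];
    assert_eq!(chat_with_fallback(&chain, "hi").unwrap(), "ok: hi");
    assert!(chat_with_fallback(&[], "hi").is_err());
    println!("fallback chain falls through to the healthy provider");
}
```

Because the chain only sees the provider trait, adding a twentieth backend never touches this code.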

🚀 Phase 6 — Production posture (weeks 10–14)

  1. Service registration (systemd / launchctl / Windows Service).
  2. HTTP/WS gateway + web dashboard. Default-bind localhost, require explicit public-bind.
  3. Pairing flow + WebAuthn for the dashboard.
  4. Cron + SOP engine. Build the deterministic execution mode early — it pays for itself when one workflow runs 1,000×.
  5. Observability: tracing, optional Prometheus, OpenTelemetry export.

๐ŸŒ Phase 7 โ€” Long tail

  1. ACP (JSON-RPC over stdio) for IDE integration.
  2. Hardware support behind a feature flag.
  3. Plugin system (WASM is the right answer if you want third-party tools).
  4. Tauri desktop app.
  5. Translations.
  6. Microkernel split: keep promoting Experimental crates → Beta → Stable.

🚨 21. Things that are easy to underestimate

Read this list when you're about to ship "v0.1":

  • Streaming + tool calls. The model will emit a tool call mid-stream. The channel may not support draft updates. Both have to work.
  • Approval UX per channel. Buttons in Slack, tokens in WhatsApp, RPC in ACP. The abstraction (Channel::ask_approval) has to be channel-shaped, not runtime-shaped.
  • Iteration cap. A model in a tool-call loop can burn $100 in 30 seconds. Cap iterations and budget per session. Always.
  • Sandbox on macOS. Seatbelt is fine but some Homebrew-linked binaries don't cooperate. Have Docker as a documented fallback.
  • Schema cleaning per provider. Gemini rejects keywords OpenAI accepts. Do the cleaner once, centrally.
  • Reasoning content round-tripping. If you don't preserve reasoning_content opaquely, DeepSeek-R1 / GLM-4.7 will reject your tool-call follow-ups.
  • Receipts cohabit with the leak detector. Make sure your secret-redactor passes receipt tokens through.
  • Cancellation propagation. Cancellation tokens have to reach the HTTP client, the sandbox process, and the channel send. Threading them through tokio::task_local! is cleaner than threading them through every signature.
  • Conversation pruning. Histories grow unbounded. ZeroClaw has history_pruner.rs, context_compressor.rs, fast_trim_tool_results, emergency_history_trim. Plan for this from day one โ€” token budgets are the real ceiling.
  • Migration story. Schema versioning + migration is in crates/zeroclaw-config/. Once users have configs, you can't break them silently.
  • Generated docs. If the docs are hand-written, they will drift. Generate the reference; it pays back tenfold.
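
As one example for the pruning item above, here is a toy "trim tool results first" pass. It prunes by entry count for brevity; the real ceiling is a token budget, and the function names cited above are ZeroClaw's, not this sketch's:

```rust
// Toy pruning pass: drop oldest tool results first, then oldest entries.
// Counts entries for brevity; a real pruner works against a token budget.

fn prune(history: &mut Vec<(String, String)>, max_entries: usize) {
    // Pass 1: oldest tool results go first; they're the cheapest to lose.
    while history.len() > max_entries {
        match history.iter().position(|(role, _)| role == "tool") {
            Some(pos) => { history.remove(pos); }
            None => break,
        }
    }
    // Pass 2 (emergency trim): drop the oldest entries of any role.
    if history.len() > max_entries {
        let excess = history.len() - max_entries;
        history.drain(0..excess);
    }
}

fn main() {
    let mut h: Vec<(String, String)> = vec![
        ("user".into(), "hi".into()),
        ("tool".into(), "big blob".into()),
        ("assistant".into(), "ok".into()),
        ("tool".into(), "another blob".into()),
        ("user".into(), "next".into()),
    ];
    prune(&mut h, 3);
    assert_eq!(h.len(), 3);
    assert!(h.iter().all(|(role, _)| role != "tool"));
    println!("pruned to {} entries, tool results first", h.len());
}
```

The two-pass shape matters: sacrificing tool output before conversation turns keeps the dialogue coherent for as long as possible.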

🌱 22. The minimal viable lattice (what you must have)

If you copy only one thing from ZeroClaw, copy this skeleton:

your-agent/
├── crates/
│   ├── your-api/         # traits only — Provider, Channel, Tool, Memory
│   ├── your-runtime/     # agent loop + security + SOP
│   ├── your-config/      # TOML schema + secrets
│   ├── your-providers/   # implementations behind feature flags
│   ├── your-channels/    # implementations behind feature flags
│   ├── your-tools/       # implementations behind feature flags
│   └── your-memory/      # SQLite default
├── docs/book/            # mdbook with refs generated from clap + schema
├── install.sh            # one-script install with --minimal / --source / --prebuilt
├── Justfile              # fmt, lint, test, ci, docs, build
├── deny.toml             # license + advisory audit
├── flake.nix             # reproducible dev env
├── AGENTS.md             # rules for AI coding assistants
├── CLAUDE.md             # project-specific assistant guidance
├── .env.example          # every supported provider listed (commented)
└── README.md             # philosophy → install → quick start → architecture

Everything else is iteration on this shape.


📖 23. References (read these next)

In rough order of "highest signal first":

  • The repo: https://github.com/zeroclaw-labs/zeroclaw
  • Philosophy doc — docs/book/src/philosophy.md
  • AGENTS.md — root of the repo
  • Architecture overview — docs/book/src/architecture/overview.md
  • Request lifecycle — docs/book/src/architecture/request-lifecycle.md
  • Security overview — docs/book/src/security/overview.md
  • Tool receipts — docs/book/src/security/tool-receipts.md (and the cited paper)
  • Autonomy levels — docs/book/src/security/autonomy.md
  • Sandboxing — docs/book/src/security/sandboxing.md
  • The trait crate — crates/zeroclaw-api/src/{provider,channel,tool,memory_traits,runtime_traits,peripherals_traits,schema,agent}.rs
  • The agent loop — crates/zeroclaw-runtime/src/agent/loop_.rs
  • The dispatcher — crates/zeroclaw-runtime/src/agent/dispatcher.rs
  • The SOP engine — crates/zeroclaw-runtime/src/sop/{engine,types,condition,dispatch}.rs
  • The compatible-provider adapter — crates/zeroclaw-providers/src/compatible.rs
  • The router and fallback — crates/zeroclaw-providers/src/{router,reliable}.rs
  • The schema cleaner — crates/zeroclaw-api/src/schema.rs
  • Foundational RFCs (in the issue tracker): #5574 (microkernel), #5576 (docs), #5577 (governance), #5579 (CI), #5615 (AI co-authorship), #5653 (zero-compromise).

TL;DR — the four sentences that matter. Build a tiny trait crate that defines Provider, Channel, Tool, and Memory, and refuse to let the runtime depend on anything else. Default to Supervised autonomy with sandbox-on, workspace-bounded, and HMAC-receipted tool calls — make YOLO loud and obvious. Ship one OpenAI-compatible adapter and you've shipped twenty providers. Generate every reference doc from the code, gate every breaking change behind a stability tier, and reject any PR that mixes concerns.


If you found this helpful, let me know by leaving a 👍 or a comment! And if you think this post could help someone, feel free to share it! Thank you very much! 😃
