Source repository: https://github.com/zeroclaw-labs/zeroclaw (Rust, dual-licensed MIT OR Apache-2.0, Rust 2024 edition).
This is an opinionated, end-to-end walkthrough of how ZeroClaw is engineered, written so you can use it as a blueprint for building your own self-hosted, multi-channel, multi-provider AI agent runtime. It covers the philosophy, the architecture, the core abstractions, the security model, the agent loop, the SOP engine, the memory layer, operational concerns, and the build/governance discipline that holds the whole thing together.
## Table of Contents

- 1. What ZeroClaw is in one paragraph
- 2. The four design opinions (the only ones that matter)
- 3. The architecture in one picture
- 4. The five core traits: the entire ABI
- 5. The agent loop
- 6. The six-layer security model
  - Layer 1: Channel pairing & access control
  - Layer 2: Autonomy level (the coarse knob)
  - Layer 3: Workspace boundary & path rules
  - Layer 4: Shell command policy
  - Layer 5: OS-level sandbox (mechanism, not policy)
  - Layer 6: Tool receipts (cryptographic audit log)
  - Additional gates (worth copying)
  - The "what about blocked calls?" question
- 7. Provider strategy: supporting 20+ LLMs without exploding
- 8. Memory: durable, queryable, GDPR-compliant
- 9. The SOP engine: automation that isn't just chat
- 10. Channels: one orchestrator, 30+ adapters
- 11. Operational surfaces
- 12. Configuration: one TOML, every key documented
- 13. Stability tiers: how a young project ships breaking changes safely
- 14. Risk tiers: how PRs are reviewed
- 15. Code conventions: non-obvious rules that protect quality
- 16. The toolchain & tooling stack
- 17. Generating reference docs from code (so they never drift)
- 18. Governance: how decisions get made
- 19. Anti-patterns to avoid (a checklist)
- 20. A practical roadmap for building your own ZeroClaw-shaped system
  - Phase 0: Foundations (week 0)
  - Phase 1: The kernel ABI (week 1)
  - Phase 2: One vertical slice (weeks 2-3)
  - Phase 3: Security model (week 4)
  - Phase 4: Sandbox (week 5)
  - Phase 5: Scale the edges (weeks 6-10)
  - Phase 6: Production posture (weeks 10-14)
  - Phase 7: Long tail
- 21. Things that are easy to underestimate
- 22. The minimal viable lattice (what you must have)
- 23. References (read these next)
## 1. What ZeroClaw is in one paragraph
ZeroClaw is a single Rust binary you configure and run. It connects to ~20 LLM providers (Anthropic, OpenAI, Ollama, Gemini, Bedrock, OpenRouter, and any OpenAI-compatible endpoint), reaches the world through 30+ messaging channels (Discord, Telegram, Matrix, email, voice, webhooks, CLI), and acts through tools (shell, browser, HTTP, hardware GPIO, custom MCP servers). Everything runs on the user's machine with their keys, in their workspace. There is no SaaS, no telemetry, no license server.
The slogan ("You own the agent. You own the data. You own the machine it runs on.") is the foundational constraint that every other design decision falls out of.
## 2. The four design opinions (the only ones that matter)
Internalize these before writing a line of code. They are the project's philosophy in priority order:
- **You own it.** Local-first. Binary on your box, keys in your config, history in your DB. Pull the power cord and it stops cleanly. No cloud tenancy.
- **Security-first, with escape hatches.** Default autonomy is `Supervised` (medium-risk asks, high-risk blocks). Sandboxes, command policies, workspace boundaries, and cryptographic tool receipts are on by default. A loud, obviously-named `YOLO` preset lets dev users opt out.
- **Minimal**, in binary size, dependencies, and surface area. A microkernel architecture with feature flags so users only ship what they use. Terse, localized tool descriptions; no hidden personality prompts.
- **Provider-agnostic.** The brain is pluggable. Switching providers is a one-line config change. Fallback chains keep things alive when an upstream flakes.

If your design conflicts with one of these, the design loses. Write them down at the top of your `PHILOSOPHY.md` before anything else.
## 3. The architecture in one picture
Trait-driven, layered, microkernel-shaped:
```
┌──────────────────────────────────────────────────────────────┐
│   channels            gateway             ACP                │
│   (30+ adapters)      (REST/WS)           (JSON-RPC)         │
│        │                  │                │                 │
│                   ZeroClaw runtime                           │
│      ┌───────────┬───────────┬───────────┐                   │
│      │  agent    │ security  │   SOP     │                   │
│      │  loop     │ policy    │  engine   │                   │
│      └───────────┴───────────┴───────────┘                   │
│         │             │             │                        │
│     providers       tools         memory                     │
│    (Anthropic,     (shell,       (SQLite,                    │
│     OpenAI,        browser,      embeddings)                 │
│     Ollama,        HTTP,                                     │
│     ~20 more)      hardware)                                 │
└──────────────────────────────────────────────────────────────┘
```
There are three structural layers:
| Layer | Crates | Job |
|---|---|---|
| Edge | `zeroclaw-channels`, `zeroclaw-providers`, `zeroclaw-gateway`, `zeroclaw-tools` | Talk to the outside world (LLMs, chat platforms, HTTP, FS, devices) |
| Core | `zeroclaw-runtime`, `zeroclaw-config`, `zeroclaw-api` | Agent loop, policy, schema, the kernel ABI |
| Support | `zeroclaw-memory`, `zeroclaw-infra`, `zeroclaw-macros`, `zeroclaw-tool-call-parser`, `zeroclaw-plugins`, `zeroclaw-hardware`, `zeroclaw-tui` | Cross-cutting utilities (tracing, derive macros, parsing, plugins, hardware HAL) |
**The kernel rule.** `zeroclaw-runtime` depends only on the trait crate `zeroclaw-api`, never on concrete provider/channel/tool crates. Everything is wired in via factory functions at startup. You can rip out Discord without recompiling the agent loop.
## 4. The five core traits: the entire ABI

`crates/zeroclaw-api/src/` defines the entire extension surface. Every other crate depends on this one. No implementations live here, only traits and shared data types. This is what keeps the kernel tiny and testable.
### 4.1 Provider: an LLM backend

```rust
#[async_trait]
pub trait Provider: Send + Sync {
    fn name(&self) -> &str;
    async fn chat(&self, req: ChatRequest<'_>) -> Result<ChatResponse>;
    async fn stream(&self, req: ChatRequest<'_>) -> Result<BoxStream<'_, StreamEvent>>;
    fn supports_tool_calls(&self) -> bool;
    fn supports_streaming(&self) -> bool;
    // ... capability flags for thinking, multimodal, JSON mode, etc.
}
```
Key support types: `ChatMessage`, `ToolCall`, `ToolResultMessage`, `ConversationMessage`, `ChatResponse`, `TokenUsage`, and `StreamEvent` (a stream of `TextDelta`, `Thinking`, `ToolCallDelta`, `Done`).
### 4.2 Channel: a messaging platform

```rust
#[async_trait]
pub trait Channel: Send + Sync {
    fn name(&self) -> &str;
    async fn send(&self, msg: SendMessage) -> Result<()>;
    async fn poll(&self) -> Result<Vec<ChannelMessage>>;                  // for pull-based
    async fn run(&self, tx: mpsc::Sender<ChannelMessage>) -> Result<()>;  // for push-based
    fn supports_draft_updates(&self) -> bool;
    async fn ask_approval(&self, req: ChannelApprovalRequest) -> Result<ChannelApprovalResponse>;
}
```
Approval lives on the channel: Telegram uses inline keyboard buttons, Slack uses Block Kit, and Discord/Signal/WhatsApp embed a short token in the prompt and wait for `<token> approve|deny|always`. The `Channel` trait is the abstraction that makes that switch transparent to the runtime.
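For channels without native approval UI, the token flow above reduces to a tiny state machine. A hypothetical, synchronous sketch (`PendingApprovals` and `Verdict` are illustrative names, not ZeroClaw's API):

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Verdict { Approve, Deny, Always }

struct PendingApprovals {
    // token -> tool name awaiting approval
    pending: HashMap<String, String>,
}

impl PendingApprovals {
    fn new() -> Self { Self { pending: HashMap::new() } }

    // Record an outstanding approval request under a short token.
    fn ask(&mut self, token: &str, tool: &str) {
        self.pending.insert(token.to_string(), tool.to_string());
    }

    // Parse a reply like "a1b2 approve" against outstanding tokens.
    // Returns the resolved (tool, verdict), or None for malformed or
    // unknown replies, which are simply ignored.
    fn on_reply(&mut self, reply: &str) -> Option<(String, Verdict)> {
        let mut parts = reply.split_whitespace();
        let token = parts.next()?;
        let verdict = match parts.next()? {
            "approve" => Verdict::Approve,
            "deny" => Verdict::Deny,
            "always" => Verdict::Always,
            _ => return None,
        };
        let tool = self.pending.remove(token)?;
        Some((tool, verdict))
    }
}
```

A reply of `a1b2 approve` resolves the pending call once; a second reply with the same token finds nothing and is dropped.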
### 4.3 Tool: an agent-callable capability

```rust
#[async_trait]
pub trait Tool: Send + Sync {
    fn name(&self) -> &str;
    fn description(&self) -> &str;
    fn parameters_schema(&self) -> serde_json::Value;
    async fn execute(&self, args: serde_json::Value) -> Result<ToolResult>;
    fn spec(&self) -> ToolSpec { /* default impl */ }
}
```
`ToolResult { success: bool, output: String, error: Option<String> }` is the whole contract. Everything from `shell`, `web_fetch`, `pdf_read`, `git_operations`, `mcp_tool` to `gpio_write` is a `Tool`.
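To see how small that contract is, here is a hypothetical, synchronous sketch of a tool (the real trait is async and takes `serde_json::Value` args); `SimpleTool` and `EchoTool` are illustrative names:

```rust
// Mirrors the ToolResult shape described above.
struct ToolResult {
    success: bool,
    output: String,
    error: Option<String>,
}

// Synchronous stand-in for the async Tool trait, for illustration only.
trait SimpleTool {
    fn name(&self) -> &str;
    fn execute(&self, args: &str) -> ToolResult;
}

struct EchoTool;

impl SimpleTool for EchoTool {
    fn name(&self) -> &str { "echo" }
    fn execute(&self, args: &str) -> ToolResult {
        // The simplest possible tool: succeed and return the args verbatim.
        ToolResult { success: true, output: args.to_string(), error: None }
    }
}
```

Registering a new capability is exactly this much work plus a factory entry; nothing in the kernel changes.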
### 4.4 Memory: a conversation/fact store

```rust
#[async_trait]
pub trait Memory: Send + Sync {
    fn name(&self) -> &str;
    async fn store(&self, key: &str, content: &str, category: MemoryCategory, ...) -> Result<()>;
    async fn search(&self, query: &str, ...) -> Result<Vec<MemoryEntry>>;
    async fn export(&self, filter: ExportFilter) -> Result<Vec<MemoryEntry>>; // GDPR data portability
    // ...
}
```
`MemoryCategory` is `Core | Daily | Conversation | Custom(String)`. Entries carry `namespace`, `importance`, and `superseded_by` to support multi-agent isolation, prioritized retrieval, and conflict resolution.
### 4.5 RuntimeAdapter and Peripheral: execution environment and hardware

`RuntimeAdapter` lets you port the agent to a new execution environment (native, Docker, Cloudflare Workers, embedded). It declares capability flags (`has_shell_access`, `has_filesystem_access`, `supports_long_running`, `memory_budget`) plus a single behavioural hook, `build_shell_command`. The agent loop queries those flags and disables tools the runtime can't honour.

`Peripheral` lets a hardware board (STM32 over serial, RPi GPIO via `gpiod`) expose its capabilities as `Tool`s. Connect → use → disconnect lifecycle, with a `health_check` probe.
## 5. The agent loop: what actually happens between "user sends message" and "agent replies"

`crates/zeroclaw-runtime/src/agent/loop_.rs` (~7,500 lines; yes, really, agent loops grow). Conceptually:
```
1. Channel adapter receives a platform-native event
   ├─ decode → canonical envelope
   ├─ dedup (ignore retries/restarts)
   └─ pair-check (allowed_users, allowed_chats, IP allowlist)
2. Runtime: deliver_message(envelope)
   ├─ Load conversation history + retrieved memory facts
   ├─ Apply tool filtering (built-in always; MCP gated by tool_filter_groups)
   └─ Compose system prompt + history + tool specs
3. Provider: chat(stream=true)
   ├─ Stream TextDelta chunks → relay to channel as draft updates (if supported)
   ├─ Stream Thinking chunks → never sent to the model again, only to the UI
   └─ Stream ToolCall
4. SecurityPolicy.validate_tool_call(name, args, risk)
   ├─ Allowed → run
   ├─ Blocked → return ToolResult::Err to the model, which can react
   └─ Approval required → channel.ask_approval(prompt)
      ├─ Telegram: inline keyboard
      ├─ Slack: Block Kit
      ├─ Discord/WhatsApp: token reply
      ├─ CLI: inline prompt
      └─ ACP: session/update kind="approval_request"
5. Tool.execute(args)
   ├─ Run inside OS-level sandbox (Landlock / Bubblewrap / Seatbelt / Docker)
   ├─ Compute receipt = HMAC-SHA256(session_key, name||args||result||ts)
   └─ Append receipt token to result text
6. Provider: chat(history + tool_result) → final TextDelta stream
7. Channel: send(final reply, threaded if applicable)
8. Memory: persist conversation + tool calls + receipts
```
Critical properties to copy:

- **Streaming is end-to-end.** Don't buffer the whole LLM response. Channels that report `supports_draft_updates()` get incremental edits to a single sent message; others flush on stream end.
- **Tool calls are mid-stream.** The model can emit a `ToolCall` while still generating text. Pause the stream, validate, invoke, resume.
- **Iteration cap.** ZeroClaw caps tool-use iterations per user message at `DEFAULT_MAX_TOOL_ITERATIONS = 10` to prevent runaway loops. Always have one.
- **Cost budgeting.** A `ToolLoopCostTrackingContext` is threaded through (via `tokio::task_local!`) so a per-session budget can break the loop early.
- **Cancellation tokens everywhere.** `tokio_util::sync::CancellationToken` is on the `SendMessage` and threaded through tool execution, so you can interrupt mid-flight without leaving zombies.
- **Task-locals for cross-cutting state.** `TOOL_LOOP_THREAD_ID`, `TOOL_LOOP_SESSION_KEY`, and `TOOL_CHOICE_OVERRIDE` are `tokio::task_local!`s set by the agent loop and read by tools and providers without polluting function signatures.
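The iteration-cap and cost-budget disciplines can be sketched together in a few lines. `run_loop`, `StepOutcome`, and the cost figures are hypothetical stand-ins for the real loop, which tracks spend through a task-local context:

```rust
const DEFAULT_MAX_TOOL_ITERATIONS: u32 = 10;

// One provider round-trip either calls a tool (at some cost) or finishes.
enum StepOutcome {
    ToolCall(f64), // cost of this iteration
    FinalReply,
}

// Returns Ok(iterations used) on a final reply, or Err when either
// guard rail (budget or iteration cap) trips.
fn run_loop(
    mut step: impl FnMut(u32) -> StepOutcome,
    budget: f64,
) -> Result<u32, String> {
    let mut spent = 0.0;
    for i in 0..DEFAULT_MAX_TOOL_ITERATIONS {
        match step(i) {
            StepOutcome::FinalReply => return Ok(i),
            StepOutcome::ToolCall(cost) => {
                spent += cost;
                if spent > budget {
                    return Err(format!("budget exceeded after {} iterations", i + 1));
                }
            }
        }
    }
    Err("iteration cap reached".into())
}
```

Both exits surface as errors the caller can report to the user, rather than a silently spinning loop.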
## 6. The six-layer security model

This is the most important part to get right. ZeroClaw composes six independent layers, outer to inner. Each layer is a meaningful defence on its own; together they are defence-in-depth.
### Layer 1: Channel pairing & access control

Done at the channel adapter, before the runtime ever sees the event: `allowed_users`, `allowed_chats`, IP allowlists for webhooks, device pairing.
### Layer 2: Autonomy level (the coarse knob)

```toml
[autonomy]
level = "supervised"   # "read_only" | "supervised" | "full"
```
- `read_only`: observe only (`file_read`, `memory_search`, HTTP GET, `web_search`). For public Q&A agents.
- `supervised` (default): low-risk runs, medium-risk asks, high-risk blocks. Approval requests time out after 300s and are treated as denials.
- `full`: no approval gates, but workspace, sandbox, and command policies still enforce. For dev/CI/SOPs.
Per-tool overrides (these are the practical knob):

```toml
[autonomy.auto_approve]
tools = ["browser_open", "http"]        # always ok

[autonomy.always_ask]
tools = ["file_write", "shell"]         # always ask

[autonomy.never_allow]
tools = ["browser_automation"]          # always block
```
Per-channel overrides let a public Bluesky channel run `read_only` while a private CLI runs `supervised`.
### Layer 3: Workspace boundary & path rules

```toml
[autonomy]
workspace_only = true
forbidden_paths = ["/etc", "/sys", "/boot", "~/.ssh", "~/.aws"]
```

`forbidden_paths` always blocks, regardless of `workspace_only`. Symlink resolution happens before enforcement.
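A minimal sketch of those two rules, assuming paths have already been canonicalized (symlinks resolved); `check_path` is a hypothetical helper, not ZeroClaw's actual function:

```rust
use std::path::{Path, PathBuf};

// Forbidden paths always block; everything else must sit inside the
// workspace when workspace_only is set. Path::starts_with compares
// whole path components, so "/etcetera" does not match "/etc".
fn check_path(
    path: &Path,
    workspace: &Path,
    workspace_only: bool,
    forbidden: &[PathBuf],
) -> Result<(), String> {
    if forbidden.iter().any(|f| path.starts_with(f)) {
        return Err(format!("forbidden path: {}", path.display()));
    }
    if workspace_only && !path.starts_with(workspace) {
        return Err(format!("outside workspace: {}", path.display()));
    }
    Ok(())
}
```

The ordering matters: the forbidden check runs first, so a forbidden directory inside the workspace still blocks.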
### Layer 4: Shell command policy

```toml
[autonomy]
allowed_commands = ["git", "cargo", "grep", "find", "ls", "cat"]
forbidden_commands = ["shutdown", "reboot", "mkfs"]
```

A pattern-matching validator runs before the command hits the shell, looking for dangerous flags, pipelines, and argument shapes (`rm -rf /`, `:(){ :|: & };:`, etc.). Blocks surface as a tool error the model can react to.
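A hedged sketch of such a validator; the function name, allowlist semantics, and pattern list are illustrative, not ZeroClaw's actual rules:

```rust
// Allowlist the program name, denylist known-bad programs, then scan the
// whole line for dangerous shapes that block even allowlisted programs.
fn validate_command(
    line: &str,
    allowed: &[&str],
    forbidden: &[&str],
) -> Result<(), String> {
    let program = line.split_whitespace().next().unwrap_or("");
    if forbidden.contains(&program) {
        return Err(format!("forbidden command: {program}"));
    }
    if !allowed.contains(&program) {
        return Err(format!("command not in allowlist: {program}"));
    }
    // Illustrative pattern list; a real validator needs far more care
    // (quoting, subshells, env tricks) than substring matching.
    for pat in ["rm -rf /", ":(){ :|: & };:", "| sh", "> /dev/sd"] {
        if line.contains(pat) {
            return Err(format!("forbidden pattern: {pat}"));
        }
    }
    Ok(())
}
```

Returning a `String` error (rather than panicking or silently dropping) is what lets the runtime wrap the block as a tool error the model can see.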
### Layer 5: OS-level sandbox (mechanism, not policy)

Auto-detected at startup:

| Platform | Preferred order |
|---|---|
| Linux | Landlock (kernel 5.13+) → Bubblewrap → Firejail → Docker → none |
| macOS | Seatbelt (native) → Docker → none |
| Windows | AppContainer (experimental) → Docker → none |

The sandbox confines the filesystem (workspace + `/usr` + `/lib` read-only + explicit extras), network (optional allowed-domains list), environment (only `shell_env_passthrough` variables pass through; `*_TOKEN`/`*_SECRET`/`*_PASSWORD` patterns never do), and process limits (CPU/memory/subprocesses/wall-time).
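The environment rule can be sketched as a filter; `env_allowed` and the suffix list are illustrative (a real matcher would presumably use the configured glob patterns):

```rust
// Secret-shaped names never pass, even if explicitly listed in the
// passthrough allowlist; everything else must be allowlisted.
fn env_allowed(name: &str, passthrough: &[&str]) -> bool {
    let secret_suffixes = ["_TOKEN", "_SECRET", "_PASSWORD"];
    if secret_suffixes.iter().any(|s| name.ends_with(s)) {
        return false;
    }
    passthrough.contains(&name)
}
```

Note the precedence: the deny rule wins over the allowlist, so a misconfigured `shell_env_passthrough` cannot leak `GITHUB_TOKEN` into a sandboxed shell.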
### Layer 6: Tool receipts (cryptographic audit log)

This is the most novel part. Every tool invocation produces an HMAC-SHA256 digest:

```
receipt = HMAC-SHA256(ephemeral_session_key, tool_name || args || result || timestamp)
```

appended to the tool result as `[receipt: zc-receipt-<timestamp>-<base64url-digest>]`.

The model sees receipts in its context but cannot forge them: it doesn't have the key. This closes the deniability gap: the model cannot claim it ran a tool it didn't, and it cannot fabricate a tool result. Receipts are chained (each includes the hash of the previous), so tampering invalidates the rest of the log. Cost: <1 ms per call, no new external deps.

Cite the paper: Basu, A. (2026). "Tool Receipts, Not Zero-Knowledge Proofs." The point is that you don't need ZK proofs for an in-process agent; a symmetric MAC with an ephemeral per-session key is enough.
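The chaining idea can be sketched as follows. Important caveat: std's `DefaultHasher` stands in for HMAC-SHA256 purely so the example has no external dependencies; it is not a MAC and gives no integrity guarantee, so a real implementation must use a keyed MAC such as `hmac` + `sha2`:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Each receipt mixes in the previous receipt's digest, so editing any
// entry in the log invalidates every receipt after it.
fn receipt(
    session_key: u64, // ephemeral, per-session; the model never sees it
    prev: u64,        // digest of the previous receipt (0 for the first)
    tool: &str,
    args: &str,
    result: &str,
    ts: u64,
) -> u64 {
    let mut h = DefaultHasher::new();
    (session_key, prev, tool, args, result, ts).hash(&mut h);
    h.finish()
}
```

Verification is the same computation replayed over the stored log with the session key; any mismatch pinpoints the first tampered entry.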
### Additional gates (worth copying)

- **OTP gating**: `[security.otp] gated_actions = ["shell", "browser"]` requires a TOTP before each listed action. Useful for remote admin.
- **Emergency stop**: `zeroclaw estop` halts all in-flight tools. Resuming requires OTP if `[security.estop] enabled = true`.
- **Prompt-injection guard**: scans model output for known injection patterns before validation.
- **Leak detector**: scans outbound messages for secret-shaped tokens (API keys, private keys) and blocks them. The detector is configured to pass through `zc-receipt-*` tokens.
- **Pairing guard**: device pairing for channel auth; prevents stolen credentials from being used on a new device.
### The "what about blocked calls?" question

A blocked call is not silent. The validator returns an error, the runtime wraps it as `ToolResult::Err`, and the model sees `Error: Shell command blocked by policy: forbidden pattern 'rm -rf /'` and can apologise, retry, or escalate. If a channel restricts the toolset (`tools_allow = [...]`), the tool is simply not advertised to the model in the first place; it never sees a tool it can't call.
## 7. Provider strategy: supporting 20+ LLMs without exploding

Look at `crates/zeroclaw-providers/src/`. The trick:

```
anthropic.rs     openai.rs        ollama.rs
gemini.rs        bedrock.rs       azure_openai.rs
copilot.rs       glm.rs           kilocli.rs
                     │
        compatible.rs ── single OpenAI-compatible impl
                     │
  Groq, Mistral, xAI, DeepSeek, Together, Fireworks,
  Perplexity, Cohere, Moonshot, Venice, OpenRouter, ...
```

One `compatible.rs` adapter implements the OpenAI Chat Completions schema and is reused for ~20 providers; most "new providers" are a config entry, not a new file. Only providers with genuinely different protocols (Anthropic Messages API, Gemini, Bedrock SigV4) get their own file.
Wrappers on top of providers:

- `router.rs`: a multi-provider router that routes by task hint (e.g. `reasoning` → DeepSeek-R1, `chat` → Sonnet, `vision` → Gemini).
- `reliable.rs`: a fallback-chain wrapper. Failover triggers: HTTP 5xx, rate limits, timeouts, schema-validation failures.
- `schema.rs`: a JSON-schema cleaner that normalizes tool schemas per provider. Gemini rejects `minLength`, `pattern`, `$ref`, and `additionalProperties`; Anthropic doesn't resolve `$ref`; OpenAI is the most permissive. The cleaner has named strategies (`Gemini`, `Anthropic`, `OpenAI`, `Conservative`) and gives you cross-provider tool portability for free.
- Tool-call parser (`zeroclaw-tool-call-parser`): normalizes the model side. OpenAI-style `tool_calls` JSON, Anthropic `<tool_use>` blocks, Qwen XML, and Ollama function-call formats are all parsed into a single `ParsedToolCall` shape. It supports a dispatcher trait pattern (`XmlToolDispatcher`, `JsonToolDispatcher`) that strips `<think>...</think>` reasoning blocks before parsing.

**Reasoning preservation.** Some providers (DeepSeek-R1, GLM-4.7, Kimi K2.5) reject tool-call history that omits the `reasoning_content` field. ZeroClaw stores it as an opaque pass-through on `ChatResponse` and `ConversationMessage::AssistantToolCalls`: you don't decode it, you just round-trip it.
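The fallback-chain idea behind `reliable.rs` reduces to "try each provider in order, keep the last error". A synchronous sketch with a hypothetical `chat_with_fallback` helper (the real wrapper also distinguishes 5xx, rate limits, timeouts, and schema failures):

```rust
// `call` stands in for one provider's chat() method; the chain is the
// ordered list of provider names from config.
fn chat_with_fallback<F>(chain: &[&str], mut call: F) -> Result<String, String>
where
    F: FnMut(&str) -> Result<String, String>,
{
    let mut last_err = String::from("empty chain");
    for &name in chain {
        match call(name) {
            Ok(reply) => return Ok(reply),
            Err(e) => last_err = format!("{name}: {e}"),
        }
    }
    Err(last_err)
}
```

Because the wrapper sits behind the same `Provider` trait, the agent loop never learns that a failover happened.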
## 8. Memory: durable, queryable, GDPR-compliant

`crates/zeroclaw-memory/`:

- Default backend: SQLite. Single-file, embedded, zero-ops.
- Optional: PostgreSQL behind `--features memory-postgres` for multi-instance deployments needing shared concurrent writes (with `pgvector` for vector search).
- Embeddings are pluggable (OpenAI, Ollama local, etc.).
- Categories: `Core` (long-term facts/preferences), `Daily` (session logs), `Conversation` (raw history), `Custom(name)` for user-defined buckets.
- Memory consolidation runs on a schedule: it summarizes long conversations, extracts facts, and marks superseded entries.
- GDPR Art. 20 export is built into the trait via `ExportFilter`. Namespace, session, category, time range: all filterable.
The memory layer is where you implement per-namespace isolation so multi-agent, multi-user deployments don't cross-contaminate.
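Prioritized retrieval with supersession can be sketched over the entry fields named above (`importance`, `superseded_by`); the struct is trimmed down and the `retrieve` function is hypothetical:

```rust
struct MemoryEntry {
    key: String,
    importance: u8,
    superseded_by: Option<String>, // key of the entry that replaced this one
}

// Skip superseded entries, then order by importance, descending, so the
// highest-value facts fill the limited context budget first.
fn retrieve(entries: &[MemoryEntry]) -> Vec<&MemoryEntry> {
    let mut live: Vec<&MemoryEntry> = entries
        .iter()
        .filter(|e| e.superseded_by.is_none())
        .collect();
    live.sort_by(|a, b| b.importance.cmp(&a.importance));
    live
}
```

Marking conflicts via `superseded_by` instead of deleting keeps the audit trail intact while keeping stale facts out of the prompt.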
## 9. The SOP engine: automation that isn't just chat

Standard Operating Procedures live in `crates/zeroclaw-runtime/src/sop/`. The model's role is loose but bounded:
- Trigger sources: cron, MQTT, webhook, peripheral events, manual.
- Execution modes:
  - `Auto`: run all steps, no human.
  - `Supervised` (default): approval before starting.
  - `StepByStep`: approval before each step.
  - `PriorityBased`: Critical/High → Auto, Normal/Low → Supervised.
  - `Deterministic`: sequential, no LLM round-trips, step outputs piped as inputs to the next step (with optional checkpoint approvals). This is the killer feature: when the model has already figured out a workflow, replay it deterministically and save 100% of the LLM cost.
- Concurrency: per-SOP concurrency limits + cooldown.
- Resumability: runs are durable; a restart picks up where it left off.

If you build something similar, design the SOP type so the same definition can be run in any mode; that's how `Deterministic` mode "graduates" workflows from LLM-driven to mechanical.
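Deterministic mode is, at its core, a fold over recorded steps. A toy sketch, where the step functions stand in for recorded tool invocations and `run_deterministic` is a hypothetical name:

```rust
// Each step is a pure function of the previous step's output, so a
// recorded workflow replays with zero LLM calls.
fn run_deterministic(steps: &[fn(&str) -> String], input: &str) -> String {
    steps.iter().fold(input.to_string(), |acc, step| step(&acc))
}

// Toy steps standing in for recorded tool invocations.
fn upper(s: &str) -> String { s.to_uppercase() }
fn exclaim(s: &str) -> String { format!("{s}!") }
```

The same step list can be wrapped with an approval check between folds to get `StepByStep` mode from the identical definition.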
## 10. Channels: one orchestrator, 30+ adapters

`crates/zeroclaw-channels/src/orchestrator/` is the part that's worth reading. The adapters themselves (Discord, Telegram, Matrix, Slack, Signal, IRC, Bluesky, Nostr, WhatsApp, WeChat, Lark, DingTalk, LINE, iMessage, Email, Voice, ...) are mostly mechanical platform wiring. The orchestrator handles the cross-cutting concerns that every channel needs:
- **Inbound dedup**: the same message arriving twice from a retry.
- **Draft updates**: editing a single sent message as the LLM streams.
- **Multi-message splits**: long replies broken into sequential messages.
- **Threading**: `thread_ts` (Slack) / thread ID (Discord); `interruption_scope_id` (distinct from `thread_ts`!) for cancellation grouping.
- **Media pipeline**: `MediaAttachment`s on `ChannelMessage` flow into transcription (audio) / vision (images) / extraction (PDFs) before the message reaches the agent loop.
- **Approval routing**: channel-appropriate UX (buttons vs token replies vs ACP RPC).
- **Link enrichment**: fetches OpenGraph metadata for URLs in inbound messages.
Build the orchestrator first, then add channels. It's the part that pays for itself with every new adapter.
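Inbound dedup is the simplest of those concerns; a sketch (a real implementation would also expire old IDs to bound memory):

```rust
use std::collections::HashSet;

// Remember message IDs already delivered and drop retries.
struct Dedup {
    seen: HashSet<String>,
}

impl Dedup {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    // Returns true if the message is fresh and should be delivered;
    // HashSet::insert returns false when the ID was already present.
    fn accept(&mut self, message_id: &str) -> bool {
        self.seen.insert(message_id.to_string())
    }
}
```

Because dedup lives in the orchestrator, every adapter gets it without writing a line of retry handling.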
## 11. Operational surfaces

ZeroClaw exposes the runtime through five distinct surfaces, in increasing intimacy:

1. **CLI** (`zeroclaw agent`, `zeroclaw onboard`, `zeroclaw service install`, `zeroclaw estop`): a `clap`-derived command tree, with reference docs generated from the derives via `cargo mdbook refs`. The docs cannot drift from the binary.
2. **HTTP/WebSocket gateway** (`zeroclaw-gateway`): REST for sessions/memory/status/cron, WS for streaming. Binds to `127.0.0.1` by default; a public bind requires an explicit `allow_public_bind = true` under `[gateway]`. Pairing required by default.
3. **Web dashboard**: chat, memory browser, config editor, cron management, tool inspection; served by the gateway.
4. **ACP (Agent Client Protocol)**: JSON-RPC 2.0 over stdio, for IDE/editor integration. `session/update` events with `kind = "approval_request"` are how IDEs render approvals.
5. **Service registration**: `zeroclaw service install` writes a unit file for systemd, launchctl, or Windows Service. The service manages its own logs, restarts, and (on Linux) sandbox negotiation.
Plus, for distribution:

- **Tauri desktop app** (`apps/tauri/`): wraps the runtime + dashboard in a native shell.
- **Docker Compose** (`docker-compose.yml` + multiple `Dockerfile.*` variants).
- **Kubernetes** (`deploy-k8s/`).
- **Firmware** (`firmware/`): stubs for STM32/RPi peripheral boards.
- **install.sh**: a single-script bootstrap that asks "prebuilt vs source?", handles `--minimal` (~6.6 MB kernel-only) and feature-flag selection, and ends with `zeroclaw onboard`.
## 12. Configuration: one TOML, every key documented

A single file at `~/.zeroclaw/config.toml`. Layout:
```toml
[providers.models.default]
provider = "openrouter"
model = "anthropic/claude-sonnet-4-6"
api_key = "${OPENROUTER_API_KEY}"

[providers.fallback]
chain = ["default", "ollama-local"]

[autonomy]
level = "supervised"
workspace_only = true
allowed_commands = ["git", "cargo", "grep"]
forbidden_paths = ["/etc", "~/.ssh"]

[security.sandbox]
backend = "auto"   # landlock | bubblewrap | firejail | docker | seatbelt | noop
network = "allowed-domains"
allowed_domains = ["api.openai.com", "api.anthropic.com"]

[agent.tool_receipts]
enabled = true
show_in_response = false
inject_system_prompt = true

[channels.discord]
token = "${DISCORD_TOKEN}"
allowed_users = ["123456789"]
autonomy_level = "supervised"   # per-channel override

[gateway]
bind = "127.0.0.1:8080"
allow_public_bind = false

[memory]
backend = "sqlite"
path = "${ZEROCLAW_HOME}/memory.db"
embedder = "ollama:nomic-embed-text"
```
Two principles to copy verbatim:

- **`Configurable` derive macro.** `crates/zeroclaw-macros/` exposes a derive that generates schema entries from struct definitions. The full config reference (`docs/book/src/reference/config.md`) is generated from the live schema. Docs cannot drift from code.
- **Secrets**: never inline, always `${ENV_VAR}` interpolation, with an encrypted local secrets store (ChaCha20-Poly1305) for things you don't want in the environment.
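The `${ENV_VAR}` interpolation itself fits in a few lines; `interpolate` is a hypothetical helper, with the lookup closure standing in for `std::env::var` so the sketch stays testable:

```rust
// Replace every ${NAME} in a config value with lookup(NAME), leaving
// unresolved names as empty strings and unterminated "${" as-is.
fn interpolate(value: &str, lookup: impl Fn(&str) -> Option<String>) -> String {
    let mut out = String::new();
    let mut rest = value;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        match rest[start + 2..].find('}') {
            Some(end) => {
                let name = &rest[start + 2..start + 2 + end];
                out.push_str(&lookup(name).unwrap_or_default());
                rest = &rest[start + 2 + end + 1..];
            }
            None => {
                // No closing brace: keep the literal text.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Doing the substitution at load time, after parsing, keeps raw secrets out of the on-disk TOML while the typed config structs see final values.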
## 13. Stability tiers: how a young project ships breaking changes safely

Every workspace crate carries a stability tier (per RFC #5574, the microkernel transition):

| Tier | Meaning |
|---|---|
| Stable | Covered by breaking-change policy. Major-version bump required. |
| Beta | Breaking changes allowed in MINOR releases with changelog notes. |
| Experimental | No stability guarantee. Iterate freely. |
Tiers are promoted, never demoted, by deliberate team decision. Today's snapshot:

| Crate | Tier | Stable target |
|---|---|---|
| `zeroclaw-api` | Experimental | v1.0.0 |
| `zeroclaw-config` | Beta | v0.8.0 |
| `zeroclaw-providers` | Beta | — |
| `zeroclaw-memory` | Beta | — |
| `zeroclaw-tool-call-parser` | Beta | v0.8.0 |
| `zeroclaw-channels` | Experimental | v1.0.0 (plugin migration) |
| `zeroclaw-tools` | Experimental | v1.0.0 (plugin migration) |
| `zeroclaw-runtime` | Experimental | — |
Copy this idea. It's how you ship publicly, get users, and still iterate.
## 14. Risk tiers: how PRs are reviewed

The contribution doc carves changes into three risk classes. Use them for your review checklist:

- **Low**: docs / chore / tests-only.
- **Medium**: most `crates/*/src/**` behaviour changes without boundary or security impact.
- **High**: anything in `runtime/security/`, `gateway/`, `tools/`, `.github/workflows/`, or access-control boundaries.
When uncertain, classify as higher risk. CI gating, required reviewers, and approval thresholds are tied to the tier.
## 15. Code conventions: non-obvious rules that protect quality

Lifted from `AGENTS.md`. These are the rules that matter:

- **Trait-driven extension.** New provider/channel/tool = implement a trait, register it in a factory. Never patch the kernel.
- **No speculative config.** Don't add a config key without a concrete use case landing in the same PR.
- **Localize user-facing output.** Every CLI message, tool description, and onboarding prompt uses `fl!()` / Fluent strings. Bare string literals in user-visible paths are a CI failure.
- **English log/panic messages.** `tracing::` events and panics stay in English with stable `error_key` fields. Translation breaks log aggregation.
- **No `unwrap()`/`expect()` in production paths.** Propagate the error or document the invariant that makes panic impossible (with a comment).
- **No dead-code suppression.** Don't `#[allow(dead_code)]` or `_var` to silence the compiler. Delete the code, wire it into behaviour, or open a follow-up issue. Reserve underscore names for required-but-unused trait/callback parameters only.
- **One concern per PR.** Mixed feature+refactor+infra patches are rejected. Stacked PRs declare `Depends on #...`; replacements declare `Supersedes #...`.
- **Conventional commits + size labels** (`size: XS/S/M`). Small PRs, always.
- **Privacy.** Never commit personal data, real identities, or production secrets, even in examples or test fixtures. There's a "neutral-placeholder palette" in `docs/.../privacy.md`.
- **AI-coding co-authorship** is welcome but governed by RFC #5615: disclose the assistant, the human reviews, the human signs the commit.
## 16. The toolchain & tooling stack
Concretely, this is the supporting cast you'll want:
| Concern | Tool |
|---|---|
| Async runtime | `tokio` |
| HTTP | `reqwest` |
| WebSocket | `tokio-tungstenite` |
| Serialization | `serde`, `serde_json`, `toml` |
| TLS | `rustls` |
| Crypto | `chacha20-poly1305` (secrets), `ring`/`hmac`+`sha2` (receipts) |
| SQLite | `rusqlite` |
| CLI parsing | `clap` (derive) |
| Terminal UI | `ratatui`, `dialoguer` |
| Logging/metrics | `tracing` + (optional) Prometheus + OpenTelemetry |
| Localization | `fluent` |
| Build profile | LTO + single codegen unit for release; faster CI variant |
| Lint/format | `cargo fmt`, `cargo clippy -D warnings` |
| Lockfile on CI | `cargo build --locked`, `cargo test --locked` |
| Pre-PR runner | `just ci` (alias for fmt-check + lint + test) |
| Docs site | `mdbook` with custom `cargo mdbook refs`/`sync` plugins |
| Fuzzing | `cargo fuzz` |
| Benchmarks | `criterion` (under `benches/`) |
| Cargo policy | `cargo-deny` (`deny.toml`): license, advisory, banned-deps audit |
| Release automation | `release-plz` (`release-plz.toml`) |
| Reproducible env | `flake.nix` (Nix), `.envrc` (direnv), `.actrc` (act) |
| Editor integration | `.vscode/`, `.gemini/`, `.claude/skills/` for agents |
## 17. Generating reference docs from code (so they never drift)
A pattern worth stealing wholesale:
- `docs/book/src/reference/cli.md` is generated from the `clap` derives (`cargo mdbook refs`).
- `docs/book/src/reference/config.md` is generated from the JSON schema produced by the `Configurable` derive macro.
- The rustdoc-as-website page (`api.md`) is rebuilt from `cargo doc --no-deps`.
- Translations in the docs book are seeded by AI fill (`cargo mdbook sync` calls Anthropic when `ANTHROPIC_API_KEY` is set), then human-reviewed via `.po` files.
Net effect: no handwritten reference page can describe a feature that doesn't exist. Wire your docs build into CI from day one.
## 18. Governance: how decisions get made

- RFC process for substantive changes (see `docs/book/src/contributing/rfcs.md`). The foundational ratified RFCs (#5574 microkernel, #5576 docs, #5577 governance, #5579 CI, #5615 AI co-authorship, #5653 zero-compromise error handling) read like a constitution.
- Two-thirds majority for core-team votes (RFC #5577).
- Public transparency documents under `docs/book/src/foundations/` are protected files; agent skills know not to move them.
- Trademark policy carved out: the code is MIT-or-Apache, but the name and logo are trademarks of ZeroClaw Labs. A common pattern for projects that want OSS code without name impersonation.
## 19. Anti-patterns to avoid (a checklist)

Straight from `AGENTS.md`, these are the mistakes you don't have to discover yourself:

- Adding heavy dependencies for minor convenience.
- Silently weakening security policy or access constraints.
- Adding speculative config/feature flags "just in case".
- Mixing massive formatting-only changes with functional changes.
- Modifying unrelated modules "while here".
- Bypassing failing checks without explicit explanation.
- Hiding behaviour-changing side effects in refactor commits.
- Suppressing unused production code with `_` prefixes or `#[allow(dead_code)]`.
- Leaving `unwrap()`/`expect()` in production paths.
- Including personal/identifying info in test data, examples, or commits.
- Calling concrete provider/channel/tool types from the kernel; go through the trait.
## 20. A practical roadmap for building your own ZeroClaw-shaped system
If you're starting fresh, do it in this order. Don't skip steps; each one reveals constraints for the next.
### Phase 0: Foundations (week 0)

- Pick a language with strong async, FFI, and a trait/interface system. (Rust is what ZeroClaw uses; Go works; TypeScript/Node works for v0 but won't reach the same binary-size/perf bar.)
- Write your `PHILOSOPHY.md`: three or four opinions, prioritized.
- Set up the workspace: a `cargo` workspace (or an equivalent monorepo), with one traits crate (`*-api`) and one runtime crate.
- Wire CI from commit #1: format + lint + test + lockfile + license/advisory audit.
### Phase 1: The kernel ABI (week 1)

- Define the `Provider`, `Channel`, `Tool`, `Memory`, and `RuntimeAdapter` traits in the API crate. Resist adding implementations here.
- Write the data types: `ChatMessage`, `ToolCall`, `ToolResult`, `ChannelMessage`, `SendMessage`, `MemoryEntry`, `StreamEvent`. Don't be afraid to evolve them; they're the bedrock.
- Stub a `noop` impl for each trait so the runtime can compile end-to-end.
### Phase 2: One vertical slice (weeks 2-3)

- One provider (start with OpenAI-compatible: it's the lingua franca; you'll get 20+ providers later for free).
- One channel (start with CLI: no auth, no webhooks, instant feedback).
- One tool (`echo`, then `web_fetch`).
- One memory backend (SQLite).
- Wire the agent loop end-to-end: receive → load history → call provider → stream → tool call → invoke → feed back → reply → persist.
- Streaming is hard. Get it right early.
### Phase 3: Security model (week 4)

- Implement autonomy levels (`ReadOnly`/`Supervised`/`Full`) and the per-tool overrides.
- Add the workspace boundary check and the forbidden-paths list.
- Add the shell command policy with allow/deny lists and the pattern validator.
- Wire the channel approval prompt: even if it's just the CLI's inline prompt, the abstraction has to work for future channels.
- Implement tool receipts with HMAC-SHA256 and an ephemeral session key. This is cheap and changes the agent's accountability profile.
๐ฆ Phase 4 โ Sandbox (week 5)
- Auto-detect Landlock on Linux, Seatbelt on macOS, Docker as universal fallback.
- Make the sandbox an opt-out (not an opt-in). Default-on, `noop` for YOLO mode.
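The default-on selection logic can be sketched as follows. This is an assumption-laden sketch: real detection must also probe runtime availability (e.g. Landlock needs kernel support) and fall back to Docker when the native backend is missing:

```rust
// Toy sandbox selection: strongest native backend per OS, Docker as the
// universal fallback, Noop only on explicit opt-out.

#[derive(Debug, PartialEq)]
enum Sandbox {
    Landlock, // Linux (kernel support permitting)
    Seatbelt, // macOS
    Docker,   // universal fallback
    Noop,     // explicit opt-out ("YOLO mode")
}

fn pick_sandbox(opt_out: bool) -> Sandbox {
    if opt_out {
        return Sandbox::Noop; // opt-OUT: never the silent default
    }
    if cfg!(target_os = "linux") {
        Sandbox::Landlock
    } else if cfg!(target_os = "macos") {
        Sandbox::Seatbelt
    } else {
        Sandbox::Docker
    }
}

fn main() {
    assert_eq!(pick_sandbox(true), Sandbox::Noop);
    assert_ne!(pick_sandbox(false), Sandbox::Noop); // sandbox is default-on
    println!("selected sandbox: {:?}", pick_sandbox(false));
}
```

The design point is the signature: opting out is an explicit boolean the user must set, not an absence of configuration.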
Phase 5 — Scale the edges (weeks 6–10)
- Add a second provider (another endpoint through the OpenAI-compat adapter — confirm your trait is right).
- Add Anthropic (because its protocol is genuinely different — confirm again).
- Add the schema cleaner when Gemini support arrives (it'll fail without it).
- Add fallback chain + router.
- Add channels in this order: webhook → Discord → Telegram → Matrix → email → voice. Each one stresses the orchestrator differently.
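The fallback chain from the list above is simple enough to sketch directly. Assumed shape: providers as ordered callables, first success wins; ZeroClaw's real router/`reliable` layer also does scoring and retries:

```rust
// Toy fallback chain: try providers in order, return the first success,
// surface the last error if all fail.

fn call_with_fallback(
    providers: &[&dyn Fn(&str) -> Result<String, String>],
    prompt: &str,
) -> Result<String, String> {
    let mut last_err = "no providers configured".to_string();
    for provider in providers {
        match provider(prompt) {
            Ok(reply) => return Ok(reply),
            Err(e) => last_err = e, // remember the failure, try the next one
        }
    }
    Err(last_err)
}

fn main() {
    let flaky = |_: &str| -> Result<String, String> {
        Err("primary: 429 rate limited".into())
    };
    let backup = |p: &str| -> Result<String, String> {
        Ok(format!("backup handled: {p}"))
    };
    let reply = call_with_fallback(&[&flaky, &backup], "hello").unwrap();
    assert_eq!(reply, "backup handled: hello");
    println!("{reply}");
}
```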
Phase 6 — Production posture (weeks 10–14)
- Service registration (systemd / launchctl / Windows Service).
- HTTP/WS gateway + web dashboard. Default-bind localhost, require explicit public-bind.
- Pairing flow + WebAuthn for the dashboard.
- Cron + SOP engine. Build the deterministic execution mode early — it pays for itself when one workflow runs 1,000×.
- Observability: tracing, optional Prometheus, OpenTelemetry export.
Phase 7 — Long tail
- ACP (JSON-RPC over stdio) for IDE integration.
- Hardware support behind a feature flag.
- Plugin system (WASM is the right answer if you want third-party tools).
- Tauri desktop app.
- Translations.
- Microkernel split: keep promoting Experimental crates → Beta → Stable.
21. Things that are easy to underestimate
Read this list when you're about to ship "v0.1":
- Streaming + tool calls. The model will emit a tool call mid-stream. The channel may not support draft updates. Both have to work.
- Approval UX per channel. Buttons in Slack, tokens in WhatsApp, RPC in ACP. The abstraction (`Channel::ask_approval`) has to be channel-shaped, not runtime-shaped.
- Iteration cap. A model in a tool-call loop can burn $100 in 30 seconds. Cap iterations and budget per session. Always.
- Sandbox on macOS. Seatbelt is fine but some Homebrew-linked binaries don't cooperate. Have Docker as a documented fallback.
- Schema cleaning per provider. Gemini rejects keywords OpenAI accepts. Do the cleaner once, centrally.
- Reasoning content round-tripping. If you don't preserve `reasoning_content` opaquely, DeepSeek-R1 / GLM-4.7 will reject your tool-call follow-ups.
- Receipts cohabit with the leak detector. Make sure your secret-redactor passes receipt tokens through.
- Cancellation propagation. Cancellation tokens have to reach the HTTP client, the sandbox process, and the channel send. Threading them through `tokio::task_local!` is cleaner than threading them through every signature.
- Conversation pruning. Histories grow unbounded. ZeroClaw has `history_pruner.rs`, `context_compressor.rs`, `fast_trim_tool_results`, `emergency_history_trim`. Plan for this from day one — token budgets are the real ceiling.
- Migration story. Schema versioning + migration is in `crates/zeroclaw-config/`. Once users have configs, you can't break them silently.
- Generated docs. If the docs are hand-written, they will drift. Generate the reference; it pays back tenfold.
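To make the pruning point concrete, here is a toy trim pass. The heuristic (roughly four characters per token, drop oldest first) is an assumption for illustration — ZeroClaw's real pruner is smarter about system prompts and tool-call pairs:

```rust
// Toy emergency history trim: drop oldest messages until an estimated
// token count fits the budget. Real pruners protect system prompts and
// keep tool-call/result pairs together.

fn trim_history(history: &mut Vec<String>, token_budget: usize) {
    // Crude estimate: ~4 characters per token, minimum 1 per message.
    let estimate = |m: &String| m.len() / 4 + 1;
    while history.len() > 1
        && history.iter().map(estimate).sum::<usize>() > token_budget
    {
        history.remove(0); // oldest first
    }
}

fn main() {
    let mut history: Vec<String> = (0..10).map(|_| "x".repeat(40)).collect();
    trim_history(&mut history, 30); // each message estimates to 11 tokens
    assert_eq!(history.len(), 2);   // 2 * 11 = 22 <= 30; 3 * 11 = 33 > 30
    println!("kept {} messages", history.len());
}
```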
22. The minimal viable lattice (what you must have)
If you copy only one thing from ZeroClaw, copy this skeleton:
```
your-agent/
├── crates/
│   ├── your-api/        # traits only — Provider, Channel, Tool, Memory
│   ├── your-runtime/    # agent loop + security + SOP
│   ├── your-config/     # TOML schema + secrets
│   ├── your-providers/  # implementations behind feature flags
│   ├── your-channels/   # implementations behind feature flags
│   ├── your-tools/      # implementations behind feature flags
│   └── your-memory/     # SQLite default
├── docs/book/           # mdbook with refs generated from clap + schema
├── install.sh           # one-script install with --minimal / --source / --prebuilt
├── Justfile             # fmt, lint, test, ci, docs, build
├── deny.toml            # license + advisory audit
├── flake.nix            # reproducible dev env
├── AGENTS.md            # rules for AI coding assistants
├── CLAUDE.md            # project-specific assistant guidance
├── .env.example         # every supported provider listed (commented)
└── README.md            # philosophy → install → quick start → architecture
```
Everything else is iteration on this shape.
23. References (read these next)
In rough order of "highest signal first":
- The repo: https://github.com/zeroclaw-labs/zeroclaw
- Philosophy doc — `docs/book/src/philosophy.md`
- AGENTS.md — root of the repo
- Architecture overview — `docs/book/src/architecture/overview.md`
- Request lifecycle — `docs/book/src/architecture/request-lifecycle.md`
- Security overview — `docs/book/src/security/overview.md`
- Tool receipts — `docs/book/src/security/tool-receipts.md` (and the cited paper)
- Autonomy levels — `docs/book/src/security/autonomy.md`
- Sandboxing — `docs/book/src/security/sandboxing.md`
- The trait crate — `crates/zeroclaw-api/src/{provider,channel,tool,memory_traits,runtime_traits,peripherals_traits,schema,agent}.rs`
- The agent loop — `crates/zeroclaw-runtime/src/agent/loop_.rs`
- The dispatcher — `crates/zeroclaw-runtime/src/agent/dispatcher.rs`
- The SOP engine — `crates/zeroclaw-runtime/src/sop/{engine,types,condition,dispatch}.rs`
- The compatible-provider adapter — `crates/zeroclaw-providers/src/compatible.rs`
- The router and fallback — `crates/zeroclaw-providers/src/{router,reliable}.rs`
- The schema cleaner — `crates/zeroclaw-api/src/schema.rs`
- Foundational RFCs (in the issue tracker): #5574 (microkernel), #5576 (docs), #5577 (governance), #5579 (CI), #5615 (AI co-authorship), #5653 (zero-compromise).
TL;DR — the four sentences that matter. Build a tiny trait crate that defines `Provider`, `Channel`, `Tool`, and `Memory`, and refuse to let the runtime depend on anything else. Default to Supervised autonomy with sandbox-on, workspace-bounded, and HMAC-receipted tool calls — make YOLO loud and obvious. Ship one OpenAI-compatible adapter and you've shipped twenty providers. Generate every reference doc from the code, gate every breaking change behind a stability tier, and reject any PR that mixes concerns.
If you found this helpful, let me know by leaving a like or a comment! And if you think this post could help someone, feel free to share it. Thank you very much!