A comprehensive, actionable guide to the principles, techniques, and architecture behind sipeed/picoclaw, written so you can build a similar system from scratch.
Table of Contents
- What PicoClaw Is and Why It Matters
- Design Philosophy
- High-Level Architecture
- Core Concept #1 - The Agent Loop & Pipeline
- Core Concept #2 - Steering (Mid-Loop Message Injection)
- Core Concept #3 - SubTurn (Hierarchical Sub-Agents)
- Core Concept #4 - Sessions & JSONL Persistence
- Core Concept #5 - Rule-Based Model Routing
- Core Concept #6 - The Hook System
- Core Concept #7 - Channel Abstraction (18+ chat platforms)
- Core Concept #8 - Provider Abstraction (30+ LLMs)
- Core Concept #9 - Tools, Skills, and MCP
- Resource-Efficiency Techniques (the <10 MB secret)
- Cross-Compilation & Single-Binary Deployment
- Reference Configuration Schema
- Step-by-Step: Build Your Own PicoClaw-Style Agent
- Common Pitfalls & Lessons Learned
- Recommended Reading Path Through the PicoClaw Source
1. What PicoClaw Is and Why It Matters
PicoClaw is a single-binary, Go-based personal AI agent that runs in under 10 MB of RAM on $10-class hardware (RISC-V SBCs, Raspberry Pi Zero, MIPS routers, Android via Termux, even old NanoKVM boards). It is heavily inspired by NanoBot, but rewritten "self-bootstrapped" in Go, with ~95% of the code generated by an agent under human review.
What makes it remarkable is not that it talks to LLMs (that's easy) but that it does so while being:
| Property | PicoClaw | Typical Python AI stack |
|---|---|---|
| Memory footprint | < 10 MB | 200 MB – 2 GB |
| Boot time | < 1 s on 0.6 GHz CPU | 5–30 s |
| Distribution | One static binary | venv + dozens of wheels |
| Architectures | x86_64, ARM, ARM64, RISC-V, MIPS, LoongArch | mostly x86_64/ARM64 |
| Channels | 18+ (Telegram, Discord, WeChat, Slack, ...) | 1–2 typically |
| LLM providers | 30+ via unified interface | 1–3 SDK-locked |
The product is not "a chatbot." It is a portable agent runtime with first-class support for tools, MCP, sub-agents, multi-channel messaging, and provider routing.
2. Design Philosophy
These are the principles that drive every design decision. Internalize these first; the code will then make sense.
2.1 Lean by default, extensible by interface
Choose Go because it produces small, statically linked binaries with tiny runtime overhead, no GIL, and predictable memory. Wrap every variable subsystem (LLM, channel, tool, hook, registry) behind an interface so a feature can be added without touching the core loop.
2.2 One binary, every architecture
A user deploying to a $10 RISC-V board should not have to think about Docker, Python versions, or shared libraries. `make build-all` produces binaries for Linux/amd64, ARM, ARM64, RISC-V, MIPS LE, LoongArch, Darwin ARM64, Windows, and NetBSD from one tree.
2.3 Append-first persistence (JSONL)
Sessions and memories are stored as JSON Lines files with a sidecar `.meta.json`. Append-only is crash-safe, debug-friendly (`tail -f`), and trivially shippable. Schema migration happens lazily on read.
2.4 Promote routing data to first-class fields
Channels do not bury `chatId`, `senderId`, and `messageId` inside generic metadata maps. Those are typed fields on `InboundMessage`. Routing, sessions, and hooks all rely on this contract.
2.5 Capabilities are discovered, not hardcoded
Each channel optionally implements `MediaSender`, `TypingCapable`, `ReactionCapable`, `MessageEditor`, `WebhookHandler`, `HealthChecker`. The manager probes via type assertions. Adding a new platform never touches the manager.
2.6 Cheap-first, escalate when necessary
A rule-based classifier scores each turn 0..1 (token count, code blocks, recent tool calls, attachments, depth). Below the threshold, the request goes to a cheap "light" model; above it, to the heavy model. This alone cuts API spend dramatically for chatty workloads.
2.7 Observe everything, intercept rarely
Five synchronous hook points (`before_llm`, `after_llm`, `before_tool`, `after_tool`, `approve_tool`) are enough. Everything else is read-only event observation through an `EventBus`. Hooks can be in-process Go code or external processes speaking JSON-RPC over stdio.
2.8 The user can change their mind mid-run
Users issue corrections. The agent loop polls a per-session steering queue after every tool call. New messages are injected before the next LLM turn; remaining queued tools are skipped with a "Skipped due to queued user message" result so the model knows what didn't run.
3. High-Level Architecture
```plaintext
                     +---------------------------------------------+
18+ Chat Channels -->| pkg/channels (per-platform sub-packages)    |
(Telegram,           |  - BaseChannel, capability interfaces       |
 Discord, ...)       |  - Manager: rate-limit, split, retry        |
                     +----------------------+----------------------+
                                            | InboundMessage
                                            v
                     +---------------------------------------------+
                     | pkg/bus (typed event bus, in/out ctx)       |
                     +----------------------+----------------------+
                                            v
                     +---------------------------------------------+
                     | pkg/routing                                 |
                     |  - Dispatch: which agent handles this?      |
                     |  - Classifier: complexity score 0..1        |
                     |  - Light/Heavy model decision               |
                     +----------------------+----------------------+
                                            v
                     +---------------------------------------------+
                     | pkg/session                                 |
                     |  - SessionScope (agent/channel/account/dim) |
                     |  - JSONL backend + .meta sidecar            |
                     |  - Canonical key sk_v1_<sha256> + aliases   |
                     +----------------------+----------------------+
                                            v
                     +---------------------------------------------+
                     | pkg/agent (the loop)                        |
                     |                                             |
                     |  pipeline_setup -> pipeline_llm ->          |
                     |  pipeline_execute (tools) ->                |
                     |  pipeline_finalize                          |
                     |                                             |
                     |  +----------+  +----------+  +----------+   |
                     |  | steering |  | subturn  |  | hooks    |   |
                     |  +----------+  +----------+  +----------+   |
                     |                                             |
                     |     ^ tools                 ^ MCP           |
                     +-----+-----------------------+---------------+
                           |                       |
                  +--------+--------+     +--------+--------+
                  | pkg/tools       |     | pkg/mcp         |
                  | fs / shell /    |     | isolated        |
                  | hardware /      |     | command         |
                  | search ...      |     | transport       |
                  +-----------------+     +-----------------+
                     +---------------------------------------------+
                     | pkg/providers (factory + facades)           |
                     |  anthropic / openai_compat / azure /        |
                     |  bedrock / oauth / cli ...                  |
                     |  cooldown · ratelimiter · fallback ·        |
                     |  error_classifier                           |
                     +---------------------------------------------+
```
Three top-level binaries are produced from `cmd/`:
- `picoclaw` - the agent itself (CLI + headless server)
- `picoclaw-launcher-tui` - terminal UI launcher
- `membench` - internal memory benchmark used to keep the <10 MB promise honest
4. Core Concept #1 - The Agent Loop & Pipeline
The `pkg/agent` package is where everything converges. The loop is split into four pipeline stages, each in its own file:
| File | Stage | Job |
|---|---|---|
| `pipeline_setup.go` | Setup | Build prompt, load session history, resolve model, mount hooks |
| `pipeline_llm.go` | LLM Call | Call provider, stream tokens, parse tool calls and thinking blocks |
| `pipeline_execute.go` | Tool Execution | Run tool calls (possibly in parallel), enforce approvals, record results |
| `pipeline_finalize.go` | Finalize | Persist session, emit events, send outbound message, close turn |
Around the pipeline are cross-cutting modules:
- `turn_coord.go` - owns the per-turn state machine, decides light vs. heavy model, chooses provider candidates.
- `turn_state.go` / `turn_context.go` - typed turn-scoped state.
- `context_manager.go` / `context_budget.go` / `context_usage.go` - keep the message window inside the model's token limit; trim oldest, summarize, or drop based on budget.
- `prompt.go` / `prompt_contributors.go` / `prompt_turn.go` - composable prompt builders. Each contributor adds a slice (system identity, tool list, memory, time, channel context).
- `eventbus.go` / `events.go` - fan-out of every meaningful event (`tool_exec_start`, `llm_request`, `turn_finished`, ...) to observers.
- `registry.go` - agent registry; `definition.go` describes one agent (name, system prompt, tool set, models, light candidates).
Actionable patterns to copy
- Make the loop a strict state machine, not a callback web. Each pipeline file exports a single function that takes and returns a turn state. Easier to test, to add tracing, and to inject hooks.
- Have the agent definition be plain data. A `Definition` struct (`pkg/agent/definition.go`) is a name + system prompt + tool allow-list + provider candidates + light candidates. Loading from YAML/JSON becomes trivial.
- Separate "what to send to the LLM" from "how to send it." Prompt contributors build the abstract message list; the provider facade (next section) maps it to vendor-specific JSON.
- Track usage at the turn level. `context_usage.go` keeps token-in/token-out per turn so you can enforce per-turn budget caps and emit metering events without parsing logs.
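The strict-state-machine idea can be sketched in a few lines. Type and function names here are illustrative, not PicoClaw's actual API:

```go
package main

import "fmt"

// TurnState is the single value threaded through the pipeline.
// Field names are illustrative, not PicoClaw's actual types.
type TurnState struct {
	Stage    string
	Messages []string
	Reply    string
}

// Each stage is a plain function: state in, state out, error out.
// No callbacks, no shared globals; trivial to test and to trace.
type Stage func(TurnState) (TurnState, error)

func setup(s TurnState) (TurnState, error) {
	s.Messages = append(s.Messages, "system: you are helpful")
	s.Stage = "setup"
	return s, nil
}

func llm(s TurnState) (TurnState, error) {
	s.Reply = "stub reply" // a real stage would call the provider here
	s.Stage = "llm"
	return s, nil
}

func finalize(s TurnState) (TurnState, error) {
	s.Stage = "finalize"
	return s, nil
}

// RunTurn walks the stages in order; a hook or tracer can wrap each one.
func RunTurn(s TurnState, stages ...Stage) (TurnState, error) {
	for _, st := range stages {
		var err error
		if s, err = st(s); err != nil {
			return s, err
		}
	}
	return s, nil
}

func main() {
	out, err := RunTurn(TurnState{}, setup, llm, finalize)
	fmt.Println(out.Stage, out.Reply, err)
}
```

Because each stage is a value of type `Stage`, inserting a tracing or hook wrapper is just function composition.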
5. Core Concept #2 - Steering (Mid-Loop Message Injection)
"The user can correct the agent at any moment. Make that a first-class concern."
`pkg/agent/steering.go` (and `agent_steering.go`) implements a per-session FIFO queue that the loop polls at four checkpoints:
- Loop initialization (before first LLM call)
- After each tool completes
- After each non-tool LLM response
- Before turn finalization
If a queued message exists at any of those points:
- Any remaining tool calls in the current LLM response are skipped, each receiving the synthetic result `"Skipped due to queued user message."` so the model still understands what did/didn't run.
- The queued message is appended to the conversation as a new `user` turn.
- The loop re-enters the LLM stage.
Why this matters
- Side-effect safety. A user yelling "don't send that email" actually stops the email if the previous tool was something else.
- Compute savings. A planned batch of three 3–4 s tool calls is ~10 s of work avoided.
- Model awareness. Skipping is announced via a tool-result message so the model can adapt instead of repeating the same plan.
Modes & limits
```go
agentLoop.SetSteeringMode(agent.SteeringOneAtATime) // default: pop one per check
agentLoop.SetSteeringMode(agent.SteeringAll)        // drain whole queue at once
```
Hard cap: `MaxQueueSize = 10` messages per session. Overflow returns an error on manual `Steer()` and a warning when an inbound channel-bus drain triggers it.
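A minimal sketch of such a bounded steering queue. Names mirror the description above, but the code is illustrative, not PicoClaw's:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// MaxQueueSize caps queued corrections per session.
const MaxQueueSize = 10

var ErrQueueFull = errors.New("steering queue full")

// SteeringQueue is a bounded, per-session FIFO.
type SteeringQueue struct {
	mu    sync.Mutex
	items []string
}

// Steer enqueues a correction; overflow is an error, not silent growth.
func (q *SteeringQueue) Steer(msg string) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) >= MaxQueueSize {
		return ErrQueueFull
	}
	q.items = append(q.items, msg)
	return nil
}

// Poll pops one message ("one-at-a-time" mode); ok=false means none queued.
func (q *SteeringQueue) Poll() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == 0 {
		return "", false
	}
	msg := q.items[0]
	q.items = q.items[1:]
	return msg, true
}

func main() {
	q := &SteeringQueue{}
	_ = q.Steer("actually, focus on X instead")
	msg, ok := q.Poll()
	fmt.Println(msg, ok)
}
```

The agent loop calls `Poll` at each of the four checkpoints listed above; "drain all" mode is just a loop over `Poll` until `ok` is false.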
Public API to copy
```go
// External: inject a correction
err := agentLoop.Steer(providers.Message{
    Role:    "user",
    Content: "actually, focus on X instead",
})

// External: nudge an idle session to continue
resp, err := agentLoop.Continue(ctx, sessionKey, channel, chatID)
```
Implementation notes
- The queue is scoped by canonical session key. Different chats never bleed into each other.
- Media references (`media://...`) survive steering; they're resolved in the normal pipeline before the provider call.
- Inbound messages for a session that already has an active turn are automatically enqueued as steering rather than starting a competing turn.
6. Core Concept #3 - SubTurn (Hierarchical Sub-Agents)
Sub-agents are isolated nested loops spawned by a parent turn. Defined in `pkg/agent/subturn.go`.
Properties
| Property | Value |
|---|---|
| Max nesting depth | 3 |
| Max concurrent per parent | 5 (semaphore-guarded, 30 s timeout) |
| Default timeout | 5 min (parent and child have independent timeouts) |
| Message buffer | 50 messages per sub-turn (does not contaminate parent history) |
| Result delivery | async via `pendingResults` channel (16-message buffer) |
| Cancellation | hard abort cascades to children & grandchildren |
| `Critical: true` | survives parent completion and continues in background |
When the parent polls results
Same checkpoints as steering: before every LLM call, after every tool call, before finalize. This keeps result handling deterministic without polling threads.
Why context derives from `context.Background()`, not the parent's ctx
So that a child's independent timeout is not cut short just because the parent finished early. If you want cascading cancellation for a particular sub-turn, the parent calls `cancel()` explicitly.
Pattern to copy
```go
// inside parent agent loop
result, err := agent.SpawnSubTurn(ctx, agent.SubTurnSpec{
    AgentDef:   "researcher",
    Goal:       "Find primary sources for claim X",
    Critical:   false,
    Timeout:    2 * time.Minute,
    MaxHistory: 50,
})
```
Pitfalls
- Orphan results. If the parent finishes before the child, the result is dropped (with a telemetry event). Either mark the child `Critical: true` or await it explicitly.
- Buffer overflow. With 5 concurrent subs and a 16-slot result buffer, bursty completions can overflow; design subs to emit a single final result, not progress updates.
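The concurrency guards above (a semaphore capping concurrent children, a bounded result buffer drained at checkpoints, drop-on-overflow) can be sketched like this. Names are illustrative, not PicoClaw's actual types, and the 30 s semaphore timeout is omitted for brevity:

```go
package main

import (
	"fmt"
	"sync"
)

// SubTurnRunner guards child sub-turns with a semaphore and delivers
// their results through a bounded buffer.
type SubTurnRunner struct {
	sem            chan struct{} // capacity 5: max concurrent per parent
	pendingResults chan string   // capacity 16: result buffer
	wg             sync.WaitGroup
}

func NewSubTurnRunner() *SubTurnRunner {
	return &SubTurnRunner{
		sem:            make(chan struct{}, 5),
		pendingResults: make(chan string, 16),
	}
}

// Spawn runs goal() as a child; blocks if 5 children are already running.
func (r *SubTurnRunner) Spawn(goal func() string) {
	r.sem <- struct{}{}
	r.wg.Add(1)
	go func() {
		defer func() { <-r.sem; r.wg.Done() }()
		select {
		case r.pendingResults <- goal():
		default: // buffer full: result is dropped (emit telemetry here)
		}
	}()
}

// Drain collects whatever results are ready; called at loop checkpoints.
func (r *SubTurnRunner) Drain() []string {
	var out []string
	for {
		select {
		case res := <-r.pendingResults:
			out = append(out, res)
		default:
			return out
		}
	}
}

func main() {
	r := NewSubTurnRunner()
	r.Spawn(func() string { return "sources found" })
	r.wg.Wait()
	fmt.Println(r.Drain())
}
```

The non-blocking send in `Spawn` is exactly where the "bursty completions can overflow" pitfall lives.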
7. Core Concept #4 - Sessions & JSONL Persistence
`pkg/session` answers two questions: which messages share a conversation? And how is that conversation stored durably?
7.1 SessionScope - the structured identity of a conversation
```go
type SessionScope struct {
    Version    string            // ScopeVersionV1
    AgentID    string            // routed agent
    Channel    string            // normalized channel name ("telegram")
    Account    string            // bot/account identifier
    Dimensions []string          // active partition dims, e.g. ["chat"]
    Values     map[string]string // concrete dim values
}
```
The default dimension set is `["chat"]`: "one shared conversation per chat unless a dispatch rule overrides it." A dispatch rule can promote topic or sender into the dimension set to split or merge conversations.
7.2 Two key formats
| Format | Example | Purpose |
|---|---|---|
| Canonical | `sk_v1_<sha256>` | Stable, opaque, the source of truth |
| Legacy | `agent:main:direct:user123` | Backward compat, resolved transparently |
The JSONL backend resolves legacy aliases to canonical keys during reads and writes, so you can rename schemes without losing history.
7.3 JSONL on disk
Per session:
- `<key>.jsonl` - one `providers.Message` per line, append-only.
- `<key>.meta.json` - `{ summary, created_at, updated_at, line_count, skip_offset, scope, aliases }`.
Why two files: messages are append-only and crash-safe; metadata is overwritten under a per-shard mutex, but it is small enough that a torn write is recoverable from the JSONL.
"Designed around append-first durability and stale-over-loss recovery."
7.4 Allocator rules
The allocator turns inbound metadata into scope values:
- `space` - `<space_type>:<space_id>`
- `chat` - `<chat_type>:<chat_id>`
- `topic` - `topic:<topic_id>`
- `sender` - canonicalized through identity-link mappings (so that a user's Telegram ID and Slack ID map to the same logical sender)
Special case: Telegram forum topics append `/<topic_id>` to chat values when `topic` is not an explicit dimension, preventing topic cross-talk by default.
7.5 Concurrency
A 64-shard mutex array (hash key -> shard) serializes per-session writes without keeping an unbounded mutex map. This is a small but important pattern: lock striping is essentially free and fixes 99% of session-store contention bugs.
7.6 Migration
On startup the system attempts to migrate legacy JSON sessions into JSONL. If migration fails, it falls back to the legacy `SessionManager` rather than crash-looping the agent.
Actionable patterns
- Make session keys content-addressed (`sha256` over a canonical scope signature) so renaming dimensions doesn't break history.
- Sidecar metadata is far simpler than embedding a header line in the JSONL.
- Lock striping > one big mutex > one mutex per session. 64 shards is a good default.
8. Core Concept #5 - Rule-Based Model Routing
`pkg/routing` is a two-stage pipeline:
1. Agent dispatch - `Router` picks which agent definition handles the message (rules over channel, sender, content, command prefix, etc.).
2. Model routing - once an agent is chosen, the `RuleClassifier` decides whether to use the agent's primary (heavy) model or a globally configured cheap light model.
8.1 Configuration

```json
{
  "routing": {
    "enabled": true,
    "light_model": "gemini-2.0-flash",
    "threshold": 0.35
  }
}
```
8.2 Features extracted per turn
The classifier is intentionally language-agnostic (no keyword lists), using five structural features:
| Feature | What it measures |
|---|---|
| `TokenEstimate` | Approximate token count (CJK-aware rune counting) |
| `CodeBlockCount` | Number of fenced code blocks in the latest message |
| `RecentToolCalls` | Tool invocations in the last 6 history entries |
| `ConversationDepth` | Total history length |
| `HasAttachments` | Media references or recognized file extensions |
8.3 Weighted scoring (clamped to [0,1])
| Signal | Weight |
|---|---|
| Has attachments | 1.00 |
| Code block present | 0.40 |
| Tokens > 200 | 0.35 |
| Recent tool calls > 3 | 0.25 |
| Tokens > 50 | 0.15 |
| Recent tool calls 1–3 | 0.10 |
| Conversation depth > 10 | 0.10 |
With threshold 0.35, trivial chat stays cheap; code, attachments, or active tool use trigger heavy. Long plain prompts cross at the 200-token boundary.
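The scoring table translates almost mechanically into code. One detail the table leaves open is whether the two token tiers stack; this sketch treats them as mutually exclusive, so treat the structure (not the exact constants) as the takeaway:

```go
package main

import "fmt"

// Features mirrors the table above.
type Features struct {
	TokenEstimate     int
	CodeBlockCount    int
	RecentToolCalls   int
	ConversationDepth int
	HasAttachments    bool
}

// Score sums the weighted signals and clamps to [0,1].
func Score(f Features) float64 {
	s := 0.0
	if f.HasAttachments {
		s += 1.00
	}
	if f.CodeBlockCount > 0 {
		s += 0.40
	}
	switch { // token tiers: highest matching tier only (an assumption)
	case f.TokenEstimate > 200:
		s += 0.35
	case f.TokenEstimate > 50:
		s += 0.15
	}
	switch { // tool-call tiers, same idea
	case f.RecentToolCalls > 3:
		s += 0.25
	case f.RecentToolCalls >= 1:
		s += 0.10
	}
	if f.ConversationDepth > 10 {
		s += 0.10
	}
	if s > 1 {
		s = 1
	}
	return s
}

func main() {
	chat := Features{TokenEstimate: 20}
	code := Features{TokenEstimate: 300, CodeBlockCount: 1}
	fmt.Println(Score(chat) < 0.35, Score(code) >= 0.35) // light vs. heavy
}
```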
8.4 Where it plugs in
`pkg/agent/turn_coord.go` swaps the candidate provider list to `agent.LightCandidates` when score < threshold; otherwise it uses the agent's primary candidate set unchanged. The agent doesn't know; it just receives a different ordered list of providers.
Pattern to copy
- Routing rules are data, not code. Keep them in JSON. Hot-reload is then `os.Stat` + `json.Unmarshal`.
- Each agent has both `Candidates` and `LightCandidates`: primary and cheap fallback chains. Routing only picks the chain; the fallback logic inside the chain is generic (next section).
9. Core Concept #6 - The Hook System
Five synchronous hook points + arbitrary read-only observers. Defined in `pkg/agent/hooks.go`, `hook_mount.go`, `hook_process.go`.
9.1 The five synchronous points
| Stage | Allowed actions |
|---|---|
| `before_llm` | continue · modify (rewrite request) · abort_turn · hard_abort |
| `after_llm` | continue · modify (rewrite response) |
| `before_tool` | continue · modify (rewrite args) · respond (skip exec, supply result) · deny_tool |
| `after_tool` | continue · modify (rewrite tool result) |
| `approve_tool` | allow / deny only |
Everything else is observer-only events on the bus.
9.2 In-process vs out-of-process
In-process: Go function registered at startup. Zero serialization cost. Used for built-ins like rate-limit injectors, audit loggers, schema validators.
Out-of-process: any program speaking JSON-RPC over stdio, spawned and supervised by `HookManager`. Use it for Python ML reranking, secret scrubbers, external policy engines, even mocking tools during tests.
9.3 JSON-RPC framing

```json
// Request from host -> hook
{ "jsonrpc": "2.0", "id": 7, "method": "hook.before_tool", "params": { ... } }
// Hook -> host
{ "jsonrpc": "2.0", "id": 7, "result": { "action": "respond", "result": "cached" } }
// Notification (one-way; observer events)
{ "jsonrpc": "2.0", "method": "hook.event", "params": {"Kind": "tool_exec_start"} }
```
Lifecycle: the host calls `hook.hello` first to negotiate protocol version + capabilities.
9.4 Configuration shape

```json
{
  "hooks": {
    "enabled": true,
    "observer_timeout_ms": 200,
    "interceptor_timeout_ms": 5000,
    "approval_timeout_ms": 30000,
    "builtins": {
      "audit_log": { "enabled": true, "priority": 10, "config": {} }
    },
    "processes": {
      "policy_check": {
        "enabled": true,
        "priority": 100,
        "transport": "stdio",
        "command": ["python3", "/srv/policy.py"],
        "env": { "POLICY_FILE": "/etc/policy.yml" },
        "observe": ["tool_exec_start"],
        "intercept": ["before_tool", "approve_tool"]
      }
    }
  }
}
```
9.5 Hook ordering
In-process first, then by priority ascending, then by name. Deterministic and easy to reason about.
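That ordering rule is a single stable sort (the `Hook` struct shape here is illustrative):

```go
package main

import (
	"fmt"
	"sort"
)

// Hook carries just the fields ordering cares about.
type Hook struct {
	Name      string
	Priority  int
	InProcess bool
}

// SortHooks: in-process before external, then priority ascending, then name.
func SortHooks(hooks []Hook) {
	sort.SliceStable(hooks, func(i, j int) bool {
		a, b := hooks[i], hooks[j]
		if a.InProcess != b.InProcess {
			return a.InProcess // in-process first
		}
		if a.Priority != b.Priority {
			return a.Priority < b.Priority
		}
		return a.Name < b.Name
	})
}

func main() {
	hooks := []Hook{
		{"policy_check", 100, false},
		{"audit_log", 10, true},
		{"scrubber", 10, false},
		{"validator", 10, true},
	}
	SortHooks(hooks)
	for _, h := range hooks {
		fmt.Println(h.Name)
	}
}
```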
What hooks are NOT for
- Sending messages to channels themselves (use the bus).
- Suspending a turn pending human approval (state machine externally).
- Full message interception across all platforms (channel-level concern).
Patterns to copy
- Make the hook protocol versioned (`hook.hello`). It saves a major refactor 18 months later.
- Observers run with a strict timeout (e.g. 200 ms). Slow observers degrade quietly into "skipped" instead of stalling turns.
- The `respond` action lets a hook fake tool output. Cache, mock, override, all without touching the registry.
10. Core Concept #7 - Channel Abstraction (18+ chat platforms)
`pkg/channels` is the textbook example of capability-based polymorphism in Go.
10.1 The contract
Every platform sub-package embeds `BaseChannel` (`base.go`) and implements the minimum interface. Each platform self-registers a factory in `init()`:
```go
func init() {
    channels.Register("telegram", New)
}
```
`registry.go` is the single source of truth; the manager never imports specific platforms.
10.2 Capability interfaces (optional)
```go
type MediaSender interface { SendMedia(...) error }
type TypingCapable interface { ShowTyping(...) error }
type ReactionCapable interface { React(...) error }
type PlaceholderCapable interface { SendPlaceholder(...) (id string, err error) }
type MessageEditor interface { Edit(...) error }
type WebhookHandler interface { HandleWebhook(http.ResponseWriter, *http.Request) }
type HealthChecker interface { Check(ctx context.Context) error }
```
The manager probes channels with `if c, ok := ch.(MediaSender); ok { ... }`. Adding `VoiceCapable` to one platform doesn't change anyone else.
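A self-contained sketch of the probe-and-degrade pattern; the interfaces here are simplified stand-ins for the real ones:

```go
package main

import "fmt"

// Channel is the minimum every platform implements.
type Channel interface {
	Send(text string) string
}

// MediaSender is an optional capability.
type MediaSender interface {
	SendMedia(caption, url string) string
}

// A text-only platform: implements Channel only.
type textOnly struct{}

func (textOnly) Send(t string) string { return "text:" + t }

// A richer platform: implements Channel and MediaSender.
type richChannel struct{ textOnly }

func (richChannel) SendMedia(c, u string) string { return "media:" + c + ":" + u }

// Deliver uses media when the channel supports it, else degrades to text.
func Deliver(ch Channel, caption, url string) string {
	if m, ok := ch.(MediaSender); ok {
		return m.SendMedia(caption, url)
	}
	return ch.Send(caption + " " + url)
}

func main() {
	fmt.Println(Deliver(textOnly{}, "cat", "http://x/cat.png"))
	fmt.Println(Deliver(richChannel{}, "cat", "http://x/cat.png"))
}
```

The manager only ever holds `Channel` values; capabilities are discovered at the call site, so no platform is forced to stub out features it lacks.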
10.3 First-class fields, not metadata bags
`InboundMessage` (in `pkg/bus`) hoists routing data to typed fields:
```go
type InboundMessage struct {
    Peer       Peer       // platform + chat + topic
    MessageID  string
    Sender     SenderInfo // canonical identity ("telegram:42")
    Body       string
    Media      []MediaRef
    ReceivedAt time.Time
}
```
This is the contract that `pkg/session.Allocator` and `pkg/routing.Router` rely on. Put it in your design from day one; retrofitting is painful.
10.4 Centralized orchestration in the manager
The manager (not the platform) owns:
- Worker queue with rate limit per channel.
- Outbound message splitting (`split.go`): long replies are broken at sentence/word boundaries below the platform's per-message limit.
- Retries with backoff on transient errors classified by `errors.go` / `errutil.go`.
- Typing/reaction indicators as transparent decorations of long turns.
Platforms only know how to send a single chunk. Everything fancy happens above them.
10.5 Identity normalization
`pkg/identity` defines the canonical `"platform:id"` format and identity-link tables that collapse multi-platform users into one logical sender. This is what enables cross-channel memory and consistent routing.
Patterns to copy
- Self-registration via blank-import side effects: the main binary just does `_ "yourapp/channels/telegram"` and the channel becomes available. No registry plumbing.
- Capability interfaces beat optional methods on a god-interface. You will thank yourself when the 12th platform needs something weird.
- Sentinel errors in `errors.go` so the manager can decide retry vs. drop without parsing strings.
11. Core Concept #8 - Provider Abstraction (30+ LLMs)
`pkg/providers` is built around a factory + facade pattern.
11.1 Layout
```plaintext
pkg/providers/
  factory.go               // registers and instantiates providers by name
  factory_provider.go
  cli_facade.go            // unified facade for "CLI"-shaped providers
  httpapi_facade.go        // unified facade for HTTP-shaped providers
  oauth_facade.go          // unified facade for OAuth flows
  cooldown.go              // per-provider cool-down on auth/quota errors
  ratelimiter.go           // token-bucket per provider
  fallback.go              // chain-of-responsibility fallback to next candidate
  error_classifier.go      // network/auth/rate/server/unknown
  types.go                 // Message, ContentBlock, ToolCall, Usage, ...
  anthropic/               // Anthropic Messages API
  anthropic_messages/      // alt path, e.g. server-side tools
  openai_compat/           // OpenAI + every API-compatible vendor
  openai_responses_common/
  azure/                   // Azure OpenAI specifics
  bedrock/                 // AWS Bedrock
  httpapi/                 // generic HTTP fallback
  oauth/                   // device flows
  cli/                     // local CLI providers (Ollama-style)
  common/                  // shared message-utility helpers
    messageutil/
    protocoltypes/
```
11.2 The provider interface (conceptual)
A provider exposes:
- `Send(ctx, request) (response, error)` (streaming via a channel)
- `Capabilities()` (tools? vision? thinking? context window? streaming?)
- `Name()`, `Model()`
The agent loop never imports a specific provider; it picks one from a candidate list returned by the routing layer.
11.3 Reliability stack (the part most projects miss)
When a provider call fails, the wrapper consults:
1. `error_classifier` - auth error? rate limit? network blip? 5xx?
2. `cooldown` - on auth/quota errors, mark this provider unavailable for N minutes.
3. `ratelimiter` - a token bucket keeps us under contractual TPM/RPM.
4. `fallback` - try the next candidate in the chain (heavy -> light, or primary -> secondary key).
The agent never sees this; it sees one logical "send" that either returns a response or gives up after the chain is exhausted.
Patterns to copy
- Provider config is `protocol/model` strings, e.g. `"openai/gpt-5.4"`, `"anthropic/claude-opus-4-7"`. Swap by editing config; no recompile.
- Keep API keys in a separate `.security.yml`, out of `config.json`. Different file permissions, easier to scrub in bug reports.
- The classifier's job is to decide retry-or-not. Don't bake retry into each provider; it'll diverge.
12. Core Concept #9 - Tools, Skills, and MCP
Three layers of "things the agent can do beyond LLM calls":
12.1 Tools - built-in, in-process
`pkg/tools/`:
- `fs/` - read, write, list, glob.
- `shell.go` (+ Unix/Windows variants) - process exec.
- `hardware/` - device interactions (USB, GPIO, camera; appropriate for SBCs).
- `integration/` - outbound HTTP, web search (DuckDuckGo, Brave, Tavily, Baidu).
- `shared/` - shared helpers used by multiple categories.
- `registry.go` - registers tools; exposes `Get(name)`, `List()`, schema.
- `toolloop.go` - orchestrates tool execution within a single turn (parallel-safe, with approval hook integration).
- `search_tool.go` - first-class tool selector for "find a tool that does X."
- `spawn.go` / `spawn_status.go` - long-running child process management.
12.2 Skills - installable plugins
`pkg/skills/`:
- Two registry backends: `clawhub_registry.go` (custom hub) and `github_registry.go` (any repo with the right manifest).
- `installer.go` - fetch, verify, materialize on disk.
- `loader.go` - load at runtime.
- `provider_factory.go` - skills can ship with provider configurations.
- `search_cache.go` - registry search results are cached.
- `config_bridge.go` - skill config is merged into runtime config without leaking into the parent file.
A skill is essentially a packaged bundle of (tools | hooks | provider configs | prompts | docs) that can be installed by name and removed cleanly.
12.3 MCP - Model Context Protocol
`pkg/mcp/`:
- `manager.go` - owns connections to MCP servers; exposes their tools/resources/prompts to the agent.
- `isolated_command_transport.go` - spawns each MCP server in an isolated process and talks JSON-RPC over stdio. Prevents one buggy server from crashing the agent.
- `manager_test.go` - coverage.
`agent_mcp.go` (in `pkg/agent`) wires MCP-discovered tools into the per-turn tool list. From the model's perspective, an MCP tool and a built-in tool are indistinguishable.
Patterns to copy
- Built-in tools stay tiny and audited. Anything ambitious (browser automation, payments) lives behind MCP or skills.
- MCP transport isolation is non-negotiable. Treat MCP servers as untrusted child processes.
- Tools have schema, descriptions, and approval flags as data, not Go conditionals. Re-using the tool registry for skills and MCP just becomes a matter of listing them.
13. Resource-Efficiency Techniques (the <10 MB secret)
Hitting < 10 MB on a 0.6 GHz RISC-V board is engineering, not magic. The techniques used:
13.1 Choice of Go
- Static linking: no shared-library footprint.
- No JIT/interpreter. No Python startup cost.
- `-ldflags="-s -w"` strips the symbol table and DWARF info from the binary (~30% size reduction).
- `-trimpath` removes file system paths.
- UPX (optional) for additional compression on flash-poor boards.
13.2 Minimal goroutine surface
A typical concurrent system spawns thousands of goroutines. PicoClaw keeps the count tight: one per active channel listener, one per active turn, one per running sub-turn (capped at 5×N), one per spawned hook process, one per MCP transport. Goroutines are cheap, but each carries a stack; keep them counted.
13.3 Bounded queues everywhere
- Steering queue: 10
- SubTurn result buffer: 16
- Concurrent SubTurns per parent: 5
- Channel manager worker queue: per-platform configured
Bounded queues turn "memory bug" into "rejected request": you can monitor and tune.
13.4 Streaming, not buffering
LLM responses are streamed token by token. Tool outputs from spawned processes are streamed line by line. Big responses never sit fully in memory.
13.5 JSONL append-only persistence
Constant-memory writes; reads are line iterators. No O(n) JSON object reload on every turn.
13.6 Lazy initialization
Channels, hooks, and skill registries initialize only when enabled in config. Disabled subsystems contribute zero allocations.
13.7 membench as a regression gate
`cmd/membench` is shipped in the repo: a synthetic workload that measures peak RSS. If a PR busts the budget, CI catches it.
13.8 Architecture-aware patches
For MIPS LE on Ingenic X2600 / NaN2008 kernels, the Makefile patches the ELF `e_flags` at offset 36 after building. Without this, the kernel rejects the binary. Lesson: cross-compilation is not done when the linker exits.
14. Cross-Compilation & Single-Binary Deployment
14.1 The build matrix (`make build-all`)
| OS | GOARCH | Notes |
|---|---|---|
| linux | amd64 | |
| linux | arm (`GOARM=7`) | Pi Zero 2 W (32-bit) |
| linux | arm64 | Pi Zero 2 W (64-bit), most modern SBCs |
| linux | riscv64 | LicheeRV-Nano, MaixCAM |
| linux | mipsle | post-build ELF flag patch for NaN2008 kernels |
| linux | loong64 | LoongArch |
| darwin | arm64 | Apple Silicon |
| windows | amd64 | |
| netbsd | amd64 / arm64 | |
Specialized targets:
- `build-pi-zero` - 32-bit + 64-bit Pi Zero 2 W bundle.
- `build-android-bundle` - universal APK with JNI libs (the agent runs as a native service inside the APK).
- `build-whatsapp-native` - adds the native WhatsApp bridge.
- `build-launcher` / `build-launcher-tui` - web/TUI control panels.
14.2 Version stamping
```shell
go build -ldflags "-s -w \
  -X main.version=$(VERSION) \
  -X main.commit=$(COMMIT) \
  -X main.date=$(DATE)"
```
`picoclaw --version` then prints the stamped values, which is vital for triage.
14.3 Single-binary delivery
The launcher (web or TUI) is a tiny supervisor that:
- Detects the platform and picks the right binary.
- Drops it into `~/.picoclaw/`.
- Spawns it and proxies a local browser to `http://localhost:18800` for configuration.
End user double-clicks the launcher; agent runs. No package manager, no Docker, no Python.
15. Reference Configuration Schema
Annotated subset of `config.example.json`:
```jsonc
{
// Default agent settings used when an agent doesn't override.
"defaults": {
"workspace": "~/.picoclaw/workspace",
"model_name": "openai/gpt-5.4",
"max_iterations": 25,
"max_input_tokens": 128000,
"max_output_tokens": 4096
},
// Provider candidates. API keys live in .security.yml, NOT here.
"models": [
{ "name": "openai/gpt-5.4", "endpoint": "https://api.openai.com/v1" },
{ "name": "anthropic/claude-opus-4-7", "endpoint": "https://api.anthropic.com" },
{ "name": "google/gemini-2.0-flash" },
{ "name": "ollama/qwen3", "endpoint": "http://localhost:11434" }
],
// Cheap-first routing.
"routing": {
"enabled": true,
"light_model": "google/gemini-2.0-flash",
"threshold": 0.35
},
// Per-channel config; most disabled by default.
"channels": {
"telegram": { "enabled": false, "token": "" },
"discord": { "enabled": false, "token": "" },
"slack": { "enabled": false, "bot_token": "", "app_token": "" },
"matrix": { "enabled": false },
"wechat": { "enabled": false }
},
// Tool surface.
"tools": {
"web_search": { "enabled": true, "providers": ["duckduckgo", "brave", "tavily"] },
"shell": { "enabled": true, "approval_required": true },
"fs": { "enabled": true, "root": "~/.picoclaw/workspace" },
"cron": { "enabled": true }
},
// External MCP servers, each isolated in its own process.
"mcp": {
"servers": {
"filesystem": { "command": ["mcp-server-fs"], "enabled": true }
}
},
// Skills marketplace.
"skills": {
"registries": {
"clawhub": { "enabled": true, "url": "https://hub.picoclaw.io" },
"github": { "enabled": true }
},
"installed": []
},
// Hooks: in-process built-ins + external processes.
"hooks": {
"enabled": true,
"observer_timeout_ms": 200,
"interceptor_timeout_ms": 5000,
"approval_timeout_ms": 30000,
"builtins": {
"audit_log": { "enabled": true, "priority": 10 }
},
"processes": {}
},
// Heartbeat for liveness reporting and autoscale signals.
"heartbeat": { "interval_seconds": 30 },
// Web UI gateway.
"gateway": { "host": "127.0.0.1", "port": 18800 }
}
`
Companion file:
```yaml
# .security.yml -- separate file, separate permissions
openai:
  api_key: sk-...
anthropic:
  api_key: sk-ant-...
telegram:
  token: 1234:ABC...
```
16. πΊοΈ Step-by-Step: Build Your Own PicoClaw-Style Agent
A pragmatic 12-step roadmap. Each step yields a runnable artifact.
Step 1 β 𦴠Skeleton repo
`plaintext
yourapp/
cmd/yourapp/main.go # entry
pkg/
agent/
bus/
channels/
config/
providers/
routing/
session/
tools/
Makefile
config/config.example.json
.security.example.yml
`
main.go reads config, constructs a Manager, blocks on os.Signal. Nothing else yet.
Step 2 β π Typed message bus
Define InboundMessage and OutboundMessage with first-class Peer, Sender, MessageID. Build pkg/bus/bus.go as a fan-out dispatcher with bounded per-subscriber queues.
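A minimal sketch of that bus, assuming field names beyond `Peer`, `Sender`, and `MessageID` (the drop-on-full policy shown here is one possible bounded-queue strategy, not necessarily PicoClaw's):

```go
package main

import "fmt"

type Peer struct{ Channel, Chat string }

type InboundMessage struct {
	Peer      Peer
	Sender    string
	MessageID string
	Text      string
}

// Bus fans out every published message to all subscribers. Each subscriber
// owns a bounded queue; a full queue drops the message instead of blocking
// the publisher, which keeps a slow consumer from stalling the whole system.
type Bus struct{ subs []chan InboundMessage }

func (b *Bus) Subscribe(buf int) <-chan InboundMessage {
	ch := make(chan InboundMessage, buf)
	b.subs = append(b.subs, ch)
	return ch
}

func (b *Bus) Publish(m InboundMessage) {
	for _, ch := range b.subs {
		select {
		case ch <- m:
		default: // queue full: drop rather than block
		}
	}
}

func main() {
	var b Bus
	sub := b.Subscribe(4)
	b.Publish(InboundMessage{Peer: Peer{"stdio", "local"}, Sender: "user", Text: "hello"})
	fmt.Println((<-sub).Text)
}
```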
Step 3 β πΊ One channel: stdin/stdout
Implement a stdio channel that reads lines from stdin, emits InboundMessage, prints OutboundMessage. This is your dev harness β no Telegram tokens needed.
Step 4 β π€ One provider: OpenAI-compatible
Build the openai_compat provider. Make it streaming. Define a Provider interface with Send(ctx, req) (<-chan Chunk, error).
Step 5 β π Minimal agent loop
pkg/agent/pipeline_*.go. Setup β LLM β execute (no tools yet) β finalize. Hardcode a system prompt. End-to-end you should now type "hello" and get a streamed reply.
Step 6 β πΎ Sessions on JSONL
Build pkg/session with canonical keys, JSONL backend, .meta.json sidecar, 64-shard mutex. Now conversation persists across runs.
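The canonical-key derivation and the 64-shard lock can be sketched as follows. The `sk_v1_` prefix follows the convention mentioned later in this post; the NUL separator is one way to honor the "never concatenate strings" pitfall from §17:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

type SessionScope struct{ Channel, Chat, Topic string }

// sessionKey hashes the structured scope with an unambiguous separator so
// ("ab","c") and ("a","bc") can never collide.
func sessionKey(s SessionScope) string {
	sum := sha256.Sum256([]byte(s.Channel + "\x00" + s.Chat + "\x00" + s.Topic))
	return fmt.Sprintf("sk_v1_%x", sum)
}

// 64-shard mutex array: contention is per-shard, never store-wide.
var shards [64]sync.Mutex

func lockFor(key string) *sync.Mutex {
	h := sha256.Sum256([]byte(key))
	return &shards[h[0]%64]
}

func main() {
	k := sessionKey(SessionScope{Channel: "telegram", Chat: "42"})
	m := lockFor(k)
	m.Lock()
	defer m.Unlock()
	fmt.Println(k[:8], len(k))
}
```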
Step 7 β π οΈ Tools registry
Implement pkg/tools/registry.go with Get, List, Schema(). Add two tools: fs.read and web.fetch. Wire pipeline_execute to call them on parsed tool calls.
Step 8 β πΉοΈ Steering
Add per-session FIFO queue + four polling points. Test by sending a follow-up while the agent is running tools β it must skip remaining tools with the explicit "Skipped" tool result.
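The tool-execution checkpoint can be sketched like this, with a buffered channel standing in for the per-session FIFO queue (the result wording is illustrative):

```go
package main

import "fmt"

// executeTools polls the steering queue before each tool call. If a steering
// message has arrived, the remaining tools are not run; each gets an
// explicit "Skipped" result so the transcript stays consistent.
func executeTools(tools []string, steering chan string) []string {
	results := make([]string, 0, len(tools))
	for i, name := range tools {
		select {
		case msg := <-steering:
			for _, skipped := range tools[i:] {
				results = append(results, skipped+": Skipped (steering: "+msg+")")
			}
			return results
		default:
		}
		results = append(results, name+": ok")
	}
	return results
}

func main() {
	steering := make(chan string, 1)
	steering <- "actually, stop"
	fmt.Println(executeTools([]string{"fs.read", "web.fetch"}, steering))
}
```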
Step 9 β πͺ Hooks
Define five hook points + observer events. Build in-process registration first; add JSON-RPC stdio process hooks once the in-process path is solid.
Step 10 β π§ Routing
Add pkg/routing classifier with the five features and weighted scoring. Add light_model to config. Verify cheap chat goes to the light model.
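The weighted-score classifier reduces to a dot product over the feature vector. The feature names, weights, and models below are illustrative placeholders, not PicoClaw's actual values:

```go
package main

import "fmt"

// features holds five illustrative structural signals, each in [0, 1].
type features struct {
	length       float64 // normalized message length
	hasCode      float64
	asksForTools float64
	multiStep    float64
	followUp     float64
}

// weights are assumed values for the sketch.
var weights = features{0.15, 0.30, 0.25, 0.20, 0.10}

func score(f features) float64 {
	return f.length*weights.length + f.hasCode*weights.hasCode +
		f.asksForTools*weights.asksForTools + f.multiStep*weights.multiStep +
		f.followUp*weights.followUp
}

// pickModel routes below-threshold turns to the configured light_model.
func pickModel(f features, threshold float64) string {
	if score(f) < threshold {
		return "google/gemini-2.0-flash" // light_model from config
	}
	return "openai/gpt-5.4"
}

func main() {
	cheapChat := features{length: 0.1}
	fmt.Println(pickModel(cheapChat, 0.35))
}
```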
Step 11 β π‘ Second channel + capability interfaces
Add Telegram. Define MediaSender, TypingCapable, WebhookHandler capability interfaces. Move retries / splitting / rate-limit into manager.go. The Telegram channel itself should be ~200 lines.
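The capability pattern is plain Go type assertion: the manager probes each optional interface and degrades gracefully when a channel lacks it. A sketch with assumed interface shapes:

```go
package main

import "fmt"

// Channel is the mandatory contract; capabilities are optional extras.
type Channel interface{ Send(text string) error }

type TypingCapable interface{ SetTyping(on bool) error }

type basicChannel struct{}

func (basicChannel) Send(string) error { return nil }

// telegramChannel embeds the base and additionally supports typing.
type telegramChannel struct{ basicChannel }

func (telegramChannel) SetTyping(bool) error { return nil }

// showTyping is a silent no-op for channels without a typing indicator:
// the manager type-asserts instead of forcing every channel to implement it.
func showTyping(c Channel) bool {
	if t, ok := c.(TypingCapable); ok {
		t.SetTyping(true)
		return true
	}
	return false
}

func main() {
	fmt.Println(showTyping(telegramChannel{}), showTyping(basicChannel{}))
}
```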
Step 12 β π¦ Cross-compile & ship
```makefile
build-all:
	GOOS=linux GOARCH=amd64 go build -trimpath -ldflags="-s -w" -o dist/yourapp-linux-amd64 ./cmd/yourapp
	GOOS=linux GOARCH=arm GOARM=7 go build -trimpath -ldflags="-s -w" -o dist/yourapp-linux-armv7 ./cmd/yourapp
	GOOS=linux GOARCH=arm64 go build -trimpath -ldflags="-s -w" -o dist/yourapp-linux-arm64 ./cmd/yourapp
	GOOS=linux GOARCH=riscv64 go build -trimpath -ldflags="-s -w" -o dist/yourapp-linux-riscv64 ./cmd/yourapp
	GOOS=linux GOARCH=mipsle GOMIPS=softfloat go build -trimpath -ldflags="-s -w" -o dist/yourapp-linux-mipsle ./cmd/yourapp
	GOOS=darwin GOARCH=arm64 go build -trimpath -ldflags="-s -w" -o dist/yourapp-darwin-arm64 ./cmd/yourapp
```

Note that `-trimpath` is a `go build` flag, not a linker flag, so it must sit outside `-ldflags`.
Run du -h dist/* β single-digit MB binaries. Confirm with a membench run that peak RSS stays under your target (e.g. 10 MB).
Then add: SubTurns (Step 13), MCP (14), skills marketplace (15), web launcher (16), more channels (17βN).
17. β οΈ Common Pitfalls & Lessons Learned
These are the traps either explicit in PicoClaw's docs or implied by its design choices.
| Pitfall | Mitigation |
|---|---|
| Goroutine leaks via unbounded fan-out | Bounded queues + errgroup per scope (turn, session, channel). |
| Cross-channel memory crosstalk | Canonical session key from sha256(scope) β never concatenate strings. |
| Forum/topic chats merging into one conversation | Append /<topic_id> to chat values when topic isn't an explicit dimension. |
| Tool side effects after a user correction | Skip remaining tools on steering arrival; emit explicit skip results. |
| Orphan SubTurn results crashing parent | 16-slot result buffer + Critical: true for must-finish work. |
| context.Background() vs parent ctx confusion | Document explicitly in your SubTurn API; default to independent timeouts. |
| API keys in plaintext config | Two files: config.json + .security.yml with stricter perms. |
| Memory regressions slipping in | Ship membench and gate it in CI. |
| MIPS LE binaries refused by kernel | Patch ELF e_flags at offset 36 after build. |
| Hooks blocking turns | Per-class timeouts: observer 200ms, interceptor 5s, approval 30s. |
| Rebuilding when adding a provider | Provider config is protocol/model strings; factory dispatches at runtime. |
| Schema drift between sessions | Lazy migration in JSONL backend; never edit applied "migrations" β append new ones. |
| Routing rules buried in code | Routing is data β JSON rules + features. Hot-reload friendly. |
| 30 channels each duplicating retry logic | Centralize retry/split/rate-limit in manager.go; channels send a single chunk. |
| MCP server bug killing the agent | Spawn each MCP server in an isolated process via isolated_command_transport. |
| One mutex around the session store | 64-shard mutex array on hash(key). |
18. π Recommended Reading Path Through the PicoClaw Source
If you read these files in this order, the architecture clicks fast:
- `cmd/picoclaw/main.go` β the boot sequence.
- `pkg/bus/types.go` β the typed message contract that flows through the whole system.
- `pkg/agent/definition.go` β what an agent is as data.
- `pkg/agent/pipeline.go` β `pipeline_setup.go` β `pipeline_llm.go` β `pipeline_execute.go` β `pipeline_finalize.go` β the loop.
- `pkg/agent/turn_coord.go` β the brains tying routing, providers, and steering together.
- `pkg/agent/steering.go` β the most copy-worthy single concept in the project.
- `pkg/agent/subturn.go` β sub-agent semantics.
- `pkg/session/manager.go` + `jsonl_backend.go` + `allocator.go` β durable state.
- `pkg/routing/router.go` + `classifier.go` + `features.go` β cheap-first routing.
- `pkg/agent/hooks.go` + `hook_mount.go` + `hook_process.go` β extensibility.
- `pkg/channels/manager.go` + `base.go` + `interfaces.go` β channel abstraction.
- `pkg/providers/factory.go` + `cooldown.go` + `fallback.go` + `error_classifier.go` β provider reliability stack.
- `pkg/tools/registry.go` + `toolloop.go` β tool execution.
- `pkg/mcp/manager.go` + `isolated_command_transport.go` β MCP integration.
- `pkg/skills/registry.go` + `installer.go` β plugin marketplace.
- `Makefile` β cross-compilation matrix, ELF patching, version stamping.
- `docs/architecture/*.md` β official narrative for steering, subturn, sessions, routing, hooks.
π― TL;DR β The Recipe in One Page
- Use Go. Static binaries, small RSS, uniform across architectures.
- Typed message bus with first-class `Peer`, `Sender`, `MessageID`.
- Pipelined agent loop: setup β LLM β tools β finalize, with a turn state struct.
- Steering: per-session FIFO queue polled at 4 checkpoints; skipped tools get explicit results.
- SubTurns with depth β€ 3, concurrency β€ 5, independent timeouts, `Critical` flag for must-finish work.
- Sessions: structured `SessionScope` β canonical `sk_v1_<sha256>` key, JSONL + `.meta.json`, 64-shard locking.
- Routing: classifier with 5 structural features, weighted score, `light_model` below threshold.
- Hooks: 5 sync points + observer events, in-process or JSON-RPC over stdio, per-class timeouts.
- Channels: each in its own sub-package, embed `BaseChannel`, declare optional capabilities by interface; the manager owns retries/splitting/rate-limiting.
- Providers: factory + facades + cooldown + ratelimiter + fallback + error_classifier, configured by `protocol/model` strings, secrets in `.security.yml`.
- Tools / MCP / Skills: in-process tools for built-ins; MCP for untrusted external tools (isolated transport); skills as installable bundles from a registry.
- Bounded queues, streaming, lazy init, `-ldflags="-s -w"`, `-trimpath`, `membench` regression gate.
- Cross-compile to amd64/arm/arm64/riscv64/mipsle + Darwin + Windows + NetBSD; patch the MIPS ELF e_flags; ship a launcher that auto-picks the binary.
Build steps 1β12 from Β§16 in order, validate with the patterns in Β§17, and you have a PicoClaw-class agent.
If you found this helpful, let me know by leaving a π or a comment, and if you think this post could help someone, feel free to share it! Thank you very much! π