Yechua Silva

Posted on Jun 30

ZyroCLI: An Orchestrator That Gives AI Agents a Memory — and a Conscience

#opensource #ai #go #python

The Problem: Every AI Session Starts at Zero

If you've used AI coding assistants, you know the loop:

You describe your project.
The agent writes code.
You close the session.
Tomorrow, you start over.

The agent doesn't remember architectural decisions. It doesn't know why you chose PostgreSQL over MongoDB. It doesn't recall that auth.go was refactored last week. Every conversation is stateless — and stateless agents produce inconsistent, repetitive, and often conflicting code.

This isn't a tooling problem. It's a memory problem.

What ZyroCLI Actually Does

ZyroCLI is an orchestrator for OpenCode that runs a full software development pipeline (SDD) through specialized AI agents. It doesn't replace your IDE or your chat interface. It sits between you and the agent, enforcing structure, persistence, and safety.

The pipeline has six macro-phases:

Phase	What happens
PRE-F0	Alignment, domain modeling, architectural triage
F0	Research: patterns, libraries, skills discovery
F1	Specification: technical PRD with deep modules
F2	Design: component breakdown into atomic tasks
F3	Implementation: code + tests with verification loops
F4	Closure: archive, lint, final build

Each phase is executed by a different agent skill, with its own LLM, its own security policy, and its own memory context.

Four Design Decisions Worth Understanding

1. Causal Memory (Not Chat History)

Most tools append messages to a context window and hope for the best. ZyroCLI stores decisions as facts in a causal graph backed by HelixDB, a combined graph and vector database.

Each fact is a node. Relationships like CAUSED, PRECEDES, and CONTRADICTS are edges. Before every phase, the orchestrator queries the graph for relevant context. After every phase, it extracts new facts and resolves contradictions.

Why a graph? Because "we chose JWT over session cookies" is a decision that causes downstream constraints. A vector search finds semantically similar facts. A graph traversal finds causally related ones. You need both.
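To make the graph side concrete, here is a minimal in-memory sketch. It is not HelixDB's API: the `CausalGraph` class, its method names, and the example facts are all hypothetical, and only the relation names (CAUSED, CONTRADICTS) come from the description above.

```python
from collections import defaultdict, deque


class CausalGraph:
    """Toy fact graph for illustration only (not HelixDB's API)."""

    def __init__(self):
        self.facts = {}                 # fact id -> human-readable text
        self.edges = defaultdict(list)  # fact id -> [(relation, target id)]

    def add_fact(self, fact_id, text):
        self.facts[fact_id] = text

    def relate(self, source, relation, target):
        self.edges[source].append((relation, target))

    def downstream(self, start, relation="CAUSED"):
        """Breadth-first walk along one relation; returns fact ids in visit order."""
        seen, order, queue = {start}, [], deque([start])
        while queue:
            node = queue.popleft()
            for rel, target in self.edges.get(node, []):
                if rel == relation and target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order

    def contradictions(self):
        """All (source, target) pairs joined by a CONTRADICTS edge."""
        return [(s, t) for s, outs in self.edges.items()
                for rel, t in outs if rel == "CONTRADICTS"]


g = CausalGraph()
g.add_fact("jwt", "Chose JWT over session cookies")
g.add_fact("stateless", "API servers keep no session state")
g.add_fact("refresh", "A refresh-token endpoint is required")
g.add_fact("cookies", "Store sessions in server-side cookies")
g.relate("jwt", "CAUSED", "stateless")
g.relate("stateless", "CAUSED", "refresh")
g.relate("cookies", "CONTRADICTS", "jwt")

print(g.downstream("jwt"))   # facts that follow causally from the JWT decision
print(g.contradictions())    # pairs the orchestrator would need to resolve
```

A vector search over the fact texts would not necessarily surface "refresh" for a query about JWT, but the traversal reaches it in two hops, which is the point of keeping both retrieval paths.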

2. Agent-as-Validator (Separation of Opinion and Execution)

The Python agent never writes to the database. It returns an AgentDecision (validated by Pydantic), and the Go orchestrator decides whether to execute it.

This matters because unconstrained agents hallucinate tools, skip phases, and corrupt state. Separating "what the agent thinks" from "what actually runs" is the same pattern we use in human code review — but automated and enforced.
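The pattern can be sketched in a few lines. This is an assumption-heavy illustration, not ZyroCLI's code: it uses a standard-library dataclass as a stand-in for the Pydantic model, writes the orchestrator side in Python rather than Go, and the field names and action names are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical action names; the real set of allowed actions is not documented here.
ALLOWED_ACTIONS = {"write_fact", "run_tests", "advance_phase"}


@dataclass(frozen=True)
class AgentDecision:
    """A proposal from the agent. Stand-in for the Pydantic-validated model."""
    phase: str
    action: str
    payload: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject hallucinated tools at construction time.
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")


def orchestrate(decision, current_phase):
    """Orchestrator side: the only place where a proposal becomes an effect."""
    if decision.phase != current_phase:
        return f"rejected: decision targets phase {decision.phase}"
    return f"executed: {decision.action}"


print(orchestrate(AgentDecision("F3", "run_tests"), current_phase="F3"))
print(orchestrate(AgentDecision("F3", "run_tests"), current_phase="F1"))
```

The agent can only construct a well-formed proposal; whether anything runs is decided elsewhere, so a skipped phase or an invented tool fails before it touches state.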

3. Security by Phase (Boundari)

Not every phase needs the same permissions. The research agent (F0) gets read-only access. The implementation agent (F3) gets write access but requires approval for dangerous commands. Policies are YAML-defined and loaded dynamically.

This is the principle of least privilege applied to AI agents. An agent researching libraries shouldn't be able to rm -rf your repo.
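A per-phase policy check might look roughly like the sketch below. The policy fields, the command lists, and the `check` function are hypothetical, and a plain dict stands in for the YAML files; Boundari's actual schema is not shown in this post.

```python
import shlex

# Hypothetical per-phase policies; in ZyroCLI these would be loaded from YAML.
POLICIES = {
    "F0": {"write": False, "needs_approval": set()},
    "F3": {"write": True, "needs_approval": {"rm", "sudo"}},
}

# Crude, illustrative list of programs that modify the working tree.
MUTATING = {"rm", "mv", "cp", "touch", "mkdir", "sudo"}


def check(phase, command):
    """Return 'allow', 'deny', or 'needs_approval' for a shell command."""
    policy = POLICIES[phase]
    program = shlex.split(command)[0]
    if program in MUTATING and not policy["write"]:
        return "deny"
    if program in policy["needs_approval"]:
        return "needs_approval"
    return "allow"


print(check("F0", "rm -rf ."))       # research phase: read-only
print(check("F3", "rm -rf ."))       # implementation phase: human approval
print(check("F3", "touch main.go"))  # implementation phase: ordinary write
```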

4. Hybrid Search with Local Embeddings

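Memory queries combine approximate-nearest-neighbor (ANN) vector search with BM25 keyword search, merged by reciprocal rank fusion (RRF). Embeddings run locally via Ollama (nomic-embed-text, 768 dimensions, GPU via Vulkan). If Ollama isn't available, search degrades gracefully to BM25 only.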

Local embeddings mean your project context never leaves your machine. For teams working with proprietary code, this isn't optional.
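Reciprocal rank fusion itself is small enough to show in full. The sketch below assumes each retriever has already returned a ranked list of fact ids; the function names, the example ids, and the constant `k=60` (a commonly used default) are assumptions, not values taken from ZyroCLI.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(bm25_hits, vector_hits=None):
    """Fuse both rankings, or fall back to BM25 when no embeddings are available."""
    if vector_hits is None:
        return list(bm25_hits)
    return rrf_fuse([vector_hits, bm25_hits])


vector_hits = ["jwt", "postgres", "auth_refactor"]   # hypothetical ANN results
bm25_hits = ["jwt", "auth_refactor", "rate_limit"]   # hypothetical keyword results

print(hybrid_search(bm25_hits, vector_hits))
print(hybrid_search(bm25_hits))  # degraded mode: BM25 order unchanged
```

RRF only needs ranks, not comparable scores, which is why it suits mixing cosine similarities with BM25 scores that live on different scales.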

The Boomerang Cycle: Micro-Execution Inside Each Phase

Each macro-phase runs an internal 6-step cycle (Memory → Think → Delegate → Git Check → Quality → Save). But not all phases need all steps.

ZyroCLI v3.1.0 implements a Phase Skip Matrix:

           MEMORY  THINK  DELEGATE  GIT CHECK  QUALITY  SAVE
PRE-F0  │   ✅      ✅      ✅        —          —       ✅
F0      │   ✅      ✅      ✅        —          —       ✅
F1      │   ✅      ✅      ✅        —          —       ✅
F2      │   ✅      ✅      ✅        —          —       ✅
F3      │   ✅      ✅      ✅        ✅         ✅      ✅
F4      │   ✅      —       ✅        ✅         —       ✅

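PRE-F0 through F2 are alignment, research, and design phases, so they skip git validation and quality gates. F3 is implementation, so it runs all six steps, including tests and quality gates. F4 is closure: it skips Think and Quality and keeps the git check. This cuts ~40% of overhead versus running all six steps unconditionally.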
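The matrix above maps naturally onto a lookup table. The following sketch encodes it as data; the identifiers (`STEPS`, `SKIPPED`, `steps_for`) are hypothetical and not ZyroCLI's internal names.

```python
STEPS = ["memory", "think", "delegate", "git_check", "quality", "save"]

# Steps each phase skips, transcribed from the Phase Skip Matrix.
SKIPPED = {
    "PRE-F0": {"git_check", "quality"},
    "F0": {"git_check", "quality"},
    "F1": {"git_check", "quality"},
    "F2": {"git_check", "quality"},
    "F3": set(),
    "F4": {"think", "quality"},
}


def steps_for(phase):
    """Ordered list of Boomerang steps that actually run in a phase."""
    return [step for step in STEPS if step not in SKIPPED[phase]]


total_run = sum(len(steps_for(phase)) for phase in SKIPPED)
total_possible = len(STEPS) * len(SKIPPED)

print(steps_for("F4"))
print(f"{total_run} of {total_possible} step executions run")
```

By raw step count the matrix runs 26 of 36 possible step executions. The ~40% overhead figure presumably weights steps by their cost (time or tokens), which this sketch does not model.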

Benchmarks: The Honest Numbers

We analyzed 24 completed sessions from a 27-session design (3 incremental features × 3 iterations × 3 approaches) using DeepSeek V4 Flash via OpenCode Zen (free tier).

Metric	Plain OpenCode	gentle-ai	ZyroCLI
Tokens Input	199,817	262,069	213,391
Tests Generated	0%	0%	27-66% coverage
Modularity	1 file	1 file	2-3 files
Avg Time	34s	41s	121s

What this means:

ZyroCLI uses about 19% fewer input tokens than gentle-ai (213,391 vs. 262,069) because causal memory injects filtered context instead of dumping the entire chat history. It still uses about 7% more than plain OpenCode.
It's the only approach that generates tests automatically and refactors into multiple files.
It's about 3.6× slower than plain OpenCode (121s vs. 34s) and about 3× slower than gentle-ai. The Boomerang cycle has real overhead. For quick one-off tasks, plain OpenCode is faster.
For multi-file features where consistency matters, the tradeoff is worth it.
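The ratios implied by the table can be recomputed directly from its rows. The snippet below uses only the numbers reported above; the variable and function names are illustrative.

```python
# Values copied from the benchmark table.
input_tokens = {"plain": 199_817, "gentle-ai": 262_069, "zyrocli": 213_391}
avg_seconds = {"plain": 34, "gentle-ai": 41, "zyrocli": 121}


def pct_change(new, base):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100


tokens_vs_gentle = pct_change(input_tokens["zyrocli"], input_tokens["gentle-ai"])
tokens_vs_plain = pct_change(input_tokens["zyrocli"], input_tokens["plain"])
slowdown_vs_plain = avg_seconds["zyrocli"] / avg_seconds["plain"]
slowdown_vs_gentle = avg_seconds["zyrocli"] / avg_seconds["gentle-ai"]

print(f"input tokens vs gentle-ai: {tokens_vs_gentle:+.1f}%")
print(f"input tokens vs plain:     {tokens_vs_plain:+.1f}%")
print(f"time vs plain:             {slowdown_vs_plain:.2f}x")
print(f"time vs gentle-ai:         {slowdown_vs_gentle:.2f}x")
```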

Architecture at a Glance

Human ──→ ZyroCLI (Go) ──→ OpenCode (AI Agents)
              │                     │
              ├── HelixDB (graphs)  ├── PydanticAI (validation)
              ├── Boundari (security) ├── Boomerang (execution cycle)
              └── Causal Memory     └── Skills (16 specialized agents)

The Go orchestrator handles scheduling, security, and persistence. The Python agents handle reasoning, planning, and code generation. The boundary between them is strict by design.

Who Is This For?

ZyroCLI is not for everyone.

Use it if:

You're building features that span multiple files and sessions.
You need architectural consistency across AI-generated code.
You want automated tests and quality gates without manual prompting.
You work with proprietary code and need local embeddings.

Don't use it if:

You need a quick one-file script. The overhead isn't worth it.
You prefer manual control over every line of AI-generated code.
You're on hardware that can't run Ollama locally. Search still works through the BM25 fallback, but you lose the vector half of memory retrieval.

How It Helped Build Real Projects

ZyroCLI was dogfooded to develop two other projects:

Cadence: An autonomous job search agent with multi-agent debate (Advocate + Critic + Judge), semantic caching, and zero-cost architecture. The SDD pipeline ensured the LangGraph orchestrator, PydanticAI agents, and browser automation layers stayed consistent across 20+ nodes.
AgentTrail: A cryptographic audit trail SDK for EU AI Act compliance. The F1-F2 phases generated the technical specification for SHA-256 chaining and Ed25519 signatures before any code was written.

In both cases, the pipeline prevented the "agent amnesia" that usually hits after session 3.

Getting Started

# Install
curl -sSL https://github.com/yechua-silva/zyrocli/releases/latest/download/install.sh | bash
zyrocli install
zyrocli doctor

# Create a handoff
cat > handoff.yaml << 'EOF'
project:
  name: my-api
  description: REST API with JWT auth
  technologies: [Go, PostgreSQL, JWT]
EOF

# Run the pipeline
zyrocli init handoff.yaml
zyrocli run --phase F0

The TUI (zyrocli without args) walks you through model selection, GPU detection, and per-agent LLM routing.

The Bigger Picture

AI-assisted development in 2026 is shifting from "generate code faster" to "generate consistent code across sessions." The bottleneck isn't typing speed — it's context management.

ZyroCLI treats this as an infrastructure problem. Memory, security, and execution control are first-class concerns, not afterthoughts. The result is an agent that remembers why it built things the way it did — and can explain it to the next agent, or to you, six months later.

MIT License. Contributions welcome.