The Problem: Every AI Session Starts at Zero
If you've used AI coding assistants, you know the loop:
- You describe your project.
- The agent writes code.
- You close the session.
- Tomorrow, you start over.
The agent doesn't remember architectural decisions. It doesn't know why you chose PostgreSQL over MongoDB. It doesn't recall that auth.go was refactored last week. Every conversation is stateless — and stateless agents produce inconsistent, repetitive, and often conflicting code.
This isn't a tooling problem. It's a memory problem.
What ZyroCLI Actually Does
ZyroCLI is an orchestrator for OpenCode that runs a full software development pipeline (SDD) through specialized AI agents. It doesn't replace your IDE or your chat interface. It sits between you and the agent, enforcing structure, persistence, and safety.
The pipeline has six macro-phases:
| Phase | What happens |
|---|---|
| PRE-F0 | Alignment, domain modeling, architectural triage |
| F0 | Research: patterns, libraries, skills discovery |
| F1 | Specification: technical PRD with deep modules |
| F2 | Design: component breakdown into atomic tasks |
| F3 | Implementation: code + tests with verification loops |
| F4 | Closure: archive, lint, final build |
Each phase is executed by a different agent skill, with its own LLM, its own security policy, and its own memory context.
Four Design Decisions Worth Understanding
1. Causal Memory (Not Chat History)
Most tools append messages to a context window and hope for the best. ZyroCLI stores decisions as facts in a causal graph, backed by HelixDB (a graph + vector database).
Each fact is a node. Relationships like CAUSED, PRECEDES, and CONTRADICTS are edges. Before every phase, the orchestrator queries the graph for relevant context. After every phase, it extracts new facts and resolves contradictions.
Why a graph? Because "we chose JWT over session cookies" is a decision that causes downstream constraints. A vector search finds semantically similar facts. A graph traversal finds causally related ones. You need both.
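To make the difference concrete, here is a minimal in-memory sketch of a fact graph with typed edges. It is not HelixDB's API: the `FactGraph` class, its method names, and the example facts are all hypothetical, and only the relation names `CAUSED` and `CONTRADICTS` come from the description above.

```python
from collections import defaultdict, deque


class FactGraph:
    """Tiny in-memory stand-in for a causal fact graph (hypothetical, not HelixDB)."""

    def __init__(self):
        self.facts = {}                 # fact_id -> human-readable text
        self.edges = defaultdict(list)  # fact_id -> [(relation, target_id)]

    def add_fact(self, fact_id, text):
        self.facts[fact_id] = text

    def relate(self, source, relation, target):
        self.edges[source].append((relation, target))

    def downstream(self, start, relation="CAUSED"):
        """Breadth-first walk along one relation type; returns fact ids in visit order."""
        seen, order, queue = {start}, [], deque([start])
        while queue:
            node = queue.popleft()
            for rel, target in self.edges[node]:
                if rel == relation and target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order


graph = FactGraph()
graph.add_fact("auth-jwt", "We chose JWT over session cookies")
graph.add_fact("stateless-api", "API servers keep no session state")
graph.add_fact("token-refresh", "Clients need a token refresh endpoint")
graph.add_fact("auth-cookies", "Use session cookies for auth")
graph.relate("auth-jwt", "CAUSED", "stateless-api")
graph.relate("stateless-api", "CAUSED", "token-refresh")
graph.relate("auth-jwt", "CONTRADICTS", "auth-cookies")

print(graph.downstream("auth-jwt"))                 # causal consequences
print(graph.downstream("auth-jwt", "CONTRADICTS"))  # conflicting facts
```

A similarity search would probably surface `auth-cookies` next to `auth-jwt` because both are about authentication. Only the edge type tells you that one contradicts the other, and that `token-refresh` is a consequence two hops away.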
2. Agent-as-Validator (Separation of Opinion and Execution)
The Python agent never writes to the database. It returns an AgentDecision (validated by Pydantic), and the Go orchestrator decides whether to execute it.
This matters because unconstrained agents hallucinate tools, skip phases, and corrupt state. Separating "what the agent thinks" from "what actually runs" is the same pattern we use in human code review — but automated and enforced.
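A rough sketch of the pattern, under loud assumptions: in ZyroCLI the decision model is a Pydantic class and the gate lives in the Go orchestrator, while this example uses a standard-library dataclass and keeps both sides in one Python file so it runs without dependencies. The field names, the `review` function, and the action allow-list are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical allow-list; the real set of actions is not given in this post.
ALLOWED_ACTIONS = {"read_file", "write_file", "run_tests"}


@dataclass(frozen=True)
class AgentDecision:
    """What the agent proposes. It carries no ability to execute anything."""
    phase: str
    action: str
    args: dict = field(default_factory=dict)


def review(decision, current_phase):
    """Orchestrator-side gate: returns (approved, reason) and executes nothing."""
    if decision.phase != current_phase:
        return False, f"decision targets {decision.phase}, pipeline is in {current_phase}"
    if decision.action not in ALLOWED_ACTIONS:
        return False, f"unknown action: {decision.action}"
    return True, "ok"


print(review(AgentDecision("F3", "write_file", {"path": "auth.go"}), "F3"))
print(review(AgentDecision("F3", "drop_database"), "F3"))  # hallucinated tool
print(review(AgentDecision("F4", "write_file"), "F3"))     # skipped phase
```

The agent's output is plain data, so a hallucinated tool or an out-of-order phase is rejected before anything touches state.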
3. Security by Phase (Boundari)
Not every phase needs the same permissions. The research agent (F0) gets read-only access. The implementation agent (F3) gets write access but requires approval for dangerous commands. Policies are YAML-defined and loaded dynamically.
This is the principle of least privilege applied to AI agents. An agent researching libraries shouldn't be able to rm -rf your repo.
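As an illustration only, a per-phase policy check could look like the sketch below. The policy keys, the `check_command` function, and the list of commands that need approval are assumptions; the real policies are YAML files whose schema is not shown in this post, so a plain dict stands in for the parsed YAML.

```python
import shlex

# Hypothetical policies, standing in for parsed YAML.
POLICIES = {
    "F0": {"write": False, "needs_approval": []},
    "F3": {"write": True, "needs_approval": ["rm", "sudo"]},
}


def check_command(phase, command, writes):
    """Return 'allow', 'deny', or 'ask' for a command under the phase's policy."""
    policy = POLICIES.get(phase)
    tokens = shlex.split(command)
    if policy is None or not tokens:
        return "deny"  # unknown phase or empty command: fail closed
    if writes and not policy["write"]:
        return "deny"  # read-only phase attempting a write
    if tokens[0] in policy["needs_approval"]:
        return "ask"   # dangerous program: require human approval
    return "allow"


print(check_command("F0", "rm -rf .", writes=True))
print(check_command("F3", "rm -rf build", writes=True))
print(check_command("F3", "go test ./...", writes=False))
```

Failing closed on unknown phases is a design choice of this sketch, not a documented ZyroCLI behavior.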
4. Hybrid Search with Local Embeddings
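Memory queries combine approximate nearest neighbor (ANN) vector search with BM25 keyword search and merge the two rankings with Reciprocal Rank Fusion (RRF). Embeddings run locally via Ollama (nomic-embed-text, 768 dimensions, GPU via Vulkan). If Ollama isn't available, search degrades gracefully to BM25 only.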
Local embeddings mean your project context never leaves your machine. For teams working with proprietary code, this isn't optional.
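Reciprocal Rank Fusion itself is small enough to show in full. The sketch below is the textbook formula, not ZyroCLI's code: the function name, the fact ids, and the constant `k=60` (a commonly used default) are assumptions.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank), rank from 1."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["fact-jwt", "fact-sessions", "fact-postgres"]  # from ANN search
bm25_hits = ["fact-postgres", "fact-jwt", "fact-auth-go"]     # from keyword search

print(rrf_fuse([vector_hits, bm25_hits]))
print(rrf_fuse([bm25_hits]))  # BM25-only fallback keeps the original order
```

Facts that appear in both lists rise to the top, and with a single input list the fusion reduces to that list's own order, which is what a BM25-only fallback needs.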
The Boomerang Cycle: Micro-Execution Inside Each Phase
Each macro-phase runs an internal 6-step cycle (Memory → Think → Delegate → Git Check → Quality → Save). But not all phases need all steps.
ZyroCLI v3.1.0 implements a Phase Skip Matrix:
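| Phase | MEMORY | THINK | DELEGATE | GIT CHECK | QUALITY | SAVE |
|---|---|---|---|---|---|---|
| PRE-F0 | ✅ | ✅ | ✅ | - | - | ✅ |
| F0 | ✅ | ✅ | ✅ | - | - | ✅ |
| F1 | ✅ | ✅ | ✅ | - | - | ✅ |
| F2 | ✅ | ✅ | ✅ | - | - | ✅ |
| F3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| F4 | ✅ | - | ✅ | ✅ | - | ✅ |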
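PRE-F0 through F2 are alignment, research, and design phases, so they skip git validation and quality gates. F3 is implementation, so it runs every step, including tests and quality gates. F4 is closure: it skips Think and Quality but keeps the git check. This cuts ~40% of overhead versus running all six steps unconditionally.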
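The matrix is easy to express as data. This sketch encodes the skipped steps per phase exactly as listed in the matrix; the names `STEPS`, `SKIPPED`, and `steps_for` are hypothetical and say nothing about how ZyroCLI stores it internally.

```python
STEPS = ["MEMORY", "THINK", "DELEGATE", "GIT CHECK", "QUALITY", "SAVE"]

# Steps each phase skips, copied from the Phase Skip Matrix.
SKIPPED = {
    "PRE-F0": {"GIT CHECK", "QUALITY"},
    "F0": {"GIT CHECK", "QUALITY"},
    "F1": {"GIT CHECK", "QUALITY"},
    "F2": {"GIT CHECK", "QUALITY"},
    "F3": set(),
    "F4": {"THINK", "QUALITY"},
}


def steps_for(phase):
    """Return the Boomerang steps a phase actually runs, in cycle order."""
    return [step for step in STEPS if step not in SKIPPED[phase]]


total = len(STEPS) * len(SKIPPED)
skipped = sum(len(steps) for steps in SKIPPED.values())

print(steps_for("F4"))
print(f"{skipped} of {total} step executions skipped")
```

Counting steps is not the same as measuring overhead, since steps differ in cost. The step count only shows how many executions the matrix removes.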
Benchmarks: The Honest Numbers
We collected 24 completed sessions from a 27-run matrix (3 incremental features × 3 iterations × 3 approaches) using DeepSeek V4 Flash via OpenCode Zen (free tier).
| Metric | Plain OpenCode | gentle-ai | ZyroCLI |
|---|---|---|---|
| Input Tokens | 199,817 | 262,069 | 213,391 |
| Test Coverage (generated tests) | 0% | 0% | 27-66% |
| Modularity | 1 file | 1 file | 2-3 files |
| Avg Time | 34s | 41s | 121s |
What this means:
- ZyroCLI uses about 19% fewer input tokens than gentle-ai (213,391 vs. 262,069) because causal memory injects filtered context instead of dumping the entire chat history. It still uses about 7% more than plain OpenCode (199,817).
- It's the only approach that generates tests automatically and refactors into multiple files.
- It's about 3.6× slower than plain OpenCode (121s vs. 34s) and about 3× slower than gentle-ai (41s). The Boomerang cycle has real overhead. For quick one-off tasks, plain OpenCode is faster.
- For multi-file features where consistency matters, the tradeoff is worth it.
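The relative figures can be recomputed directly from the table. This snippet uses only the token counts and average times reported above; the variable and function names are just for illustration.

```python
tokens = {"Plain OpenCode": 199_817, "gentle-ai": 262_069, "ZyroCLI": 213_391}
avg_seconds = {"Plain OpenCode": 34, "gentle-ai": 41, "ZyroCLI": 121}


def pct_change(new, base):
    """Percentage change of `new` relative to `base` (negative means fewer)."""
    return (new - base) / base * 100


for name in ("Plain OpenCode", "gentle-ai"):
    token_delta = pct_change(tokens["ZyroCLI"], tokens[name])
    slowdown = avg_seconds["ZyroCLI"] / avg_seconds[name]
    print(f"ZyroCLI vs {name}: input tokens {token_delta:+.1f}%, time {slowdown:.1f}x")
```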
Architecture at a Glance
```
Human ──→ ZyroCLI (Go) ─────────────→ OpenCode (AI Agents)
          │                           │
          ├── HelixDB (graphs)        ├── PydanticAI (validation)
          ├── Boundari (security)     ├── Boomerang (execution cycle)
          └── Causal Memory           └── Skills (16 specialized agents)
```
The Go orchestrator handles scheduling, security, and persistence. The Python agents handle reasoning, planning, and code generation. The boundary between them is strict by design.
Who Is This For?
ZyroCLI is not for everyone.
Use it if:
- You're building features that span multiple files and sessions.
- You need architectural consistency across AI-generated code.
- You want automated tests and quality gates without manual prompting.
- You work with proprietary code and need local embeddings.
Don't use it if:
- You need a quick one-file script. The overhead isn't worth it.
- You prefer manual control over every line of AI-generated code.
- You're on hardware that can't run Ollama locally.
How It Helped Build Real Projects
ZyroCLI was dogfooded to develop two other projects:
Cadence: An autonomous job search agent with multi-agent debate (Advocate + Critic + Judge), semantic caching, and zero-cost architecture. The SDD pipeline ensured the LangGraph orchestrator, PydanticAI agents, and browser automation layers stayed consistent across 20+ nodes.
AgentTrail: A cryptographic audit trail SDK for EU AI Act compliance. The F1-F2 phases generated the technical specification for SHA-256 chaining and Ed25519 signatures before any code was written.
In both cases, the pipeline prevented the "agent amnesia" that usually hits after session 3.
Getting Started
```bash
# Install
curl -sSL https://github.com/yechua-silva/zyrocli/releases/latest/download/install.sh | bash
zyrocli install
zyrocli doctor

# Create a handoff
cat > handoff.yaml << 'EOF'
project:
  name: my-api
  description: REST API with JWT auth
  technologies: [Go, PostgreSQL, JWT]
EOF

# Run the pipeline
zyrocli init handoff.yaml
zyrocli run --phase F0
```
The TUI (zyrocli without args) walks you through model selection, GPU detection, and per-agent LLM routing.
The Bigger Picture
AI-assisted development in 2026 is shifting from "generate code faster" to "generate consistent code across sessions." The bottleneck isn't typing speed — it's context management.
ZyroCLI treats this as an infrastructure problem. Memory, security, and execution control are first-class concerns, not afterthoughts. The result is an agent that remembers why it built things the way it did — and can explain it to the next agent, or to you, six months later.
Links:
- GitHub: github.com/yechua-silva/zyrocli
- LinkedIn: linkedin.com/in/yechua-silva
- Related: Cadence · AgentTrail
MIT License. Contributions welcome.