karajan-code@3.0.0 is on npm. Run nvm install 22.22.1 && npm i -g karajan-code@3 and you're done.
Why I built Karajan
I love coding with AI. What I don't love is babysitting it.
If you've spent any time with Claude Code, Codex, or Gemini, you know the loop:
- You describe what you want.
- You watch the model work.
- You catch it skipping tests, hallucinating an API, or touching the wrong layer, even with a solid CLAUDE.md/AGENTS.md and SKILLs in place.
- You stop it, correct it, relaunch it.
- Repeat.
Fine for a one-off change. Doesn't scale to a real project with a known plan, real tests, and a real quality gate. You end up being the human stuck in the middle of an async loop you can't pause, retry, or measure.
I wanted something different. A tool where I describe what I want once and, from there, a set of AI agents (a coder, a reviewer, a tester, a SonarQube gate, a security auditor) run the loop themselves and ping me when it's done. Reproducibly. Locally. Using the subscriptions I already pay for.
That tool is Karajan.
The name is a nod to Herbert von Karajan, the conductor who never played a single instrument but imposed a unified vision on dozens of musicians until they sounded like one. Karajan didn't replace the musicians: he coordinated them under a single guiding intent. Karajan Code does the same with your AI agents: none of them is in charge, all of them converge under a single baton that arbitrates and unifies.
What Karajan actually is
Karajan is a local-first multi-agent orchestrator written in vanilla JavaScript on Node 22.
You describe a task (in a markdown file, a user story, or directly on the command line) and Karajan runs a role-based pipeline over it:
task
→ triage (classifies complexity, decides route)
→ researcher (reads existing code)
→ architect (designs the change)
→ planner (breaks it into steps)
→ coder (writes code and tests)
→ SonarQube (static analysis as a gate)
→ tester (runs the real test suite)
→ reviewer (independent agent reviews the diff)
→ security (security pass)
→ commiter (commit, push, opens PR)
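As a rough illustration of that flow, here is a minimal sketch of a sequential role pipeline. The role names come from the list above; `runPipeline`, the handler shape, and the stop-on-first-failure behavior are assumptions made for the example, not Karajan's actual internals (the real pipeline feeds failures back to the coder instead of simply stopping).

```javascript
// Role order taken from the pipeline above. Everything else here is hypothetical.
const ROLES = [
  "triage", "researcher", "architect", "planner", "coder",
  "sonarqube", "tester", "reviewer", "security", "commiter",
];

// Runs each role in order. A handler receives the accumulated context and
// returns { ok, output }. The first role reporting ok === false stops the run.
function runPipeline(task, handlers, roles = ROLES) {
  const context = { task, outputs: {} };
  for (const role of roles) {
    const handler = handlers[role];
    if (!handler) continue; // roles without a handler are skipped
    const result = handler(context);
    context.outputs[role] = result.output;
    if (!result.ok) {
      return { status: "stopped", stoppedAt: role, context };
    }
  }
  return { status: "done", stoppedAt: null, context };
}
```

Each later role can read what earlier roles produced through `context.outputs`, which is the property that lets a planner build on the architect's design or a reviewer see the coder's diff.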
Each role is a CLI agent. The coder can be Claude, the reviewer can be Codex, the tester can be Aider. Karajan doesn't call /v1/messages: it spawns claude -p, codex, gemini, aider, opencode as child processes and reads their stdout/stderr. That single architectural decision unlocks two things:
- Zero API cost. Your existing Pro / Plus / Max subscriptions do the work. No new bills.
- Multi-provider routing. A powerful coder + a strict reviewer + a cheap triage agent = the right model for each role, with automatic fallback when one hits its quota.
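To make the child-process idea concrete, here is a minimal sketch of spawning a CLI agent and falling back to the next one when the first looks rate-limited. It uses Node's built-in `spawnSync` for brevity; a real orchestrator would more likely stream output asynchronously. `runAgent`, `runWithFallback`, and the stderr pattern used to guess at a quota error are all assumptions for illustration, not Karajan's real detection logic.

```javascript
const { spawnSync } = require("node:child_process");

// Runs one CLI agent as a child process, sends the prompt on stdin,
// and captures stdout/stderr as strings.
function runAgent(command, args, prompt) {
  const result = spawnSync(command, args, { input: prompt, encoding: "utf8" });
  if (result.error) {
    // For example, the binary is not installed (ENOENT).
    return { ok: false, stdout: "", stderr: String(result.error.message) };
  }
  return { ok: result.status === 0, stdout: result.stdout, stderr: result.stderr };
}

// Hypothetical heuristic: treat a failed run whose stderr mentions a
// rate limit or quota as "try the next agent".
function looksRateLimited(result) {
  return !result.ok && /rate limit|quota/i.test(result.stderr);
}

// Tries agents in order and returns the first result that is not rate-limited.
// If every agent is rate-limited, the last result is returned.
function runWithFallback(agents, prompt) {
  let last = null;
  for (const agent of agents) {
    last = { agent: agent.name, ...runAgent(agent.command, agent.args, prompt) };
    if (!looksRateLimited(last)) return last;
  }
  return last;
}
```

An agent entry here is just `{ name, command, args }`, so swapping the coder from one CLI to another is a data change, not a code change.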
The whole system is TDD-first by default. The coder writes tests alongside the code, the tester runs them for real, the reviewer sees the diff. If something fails, the reviewer feeds back to the coder and the loop iterates, bounded by maxIterations and maxIterationMinutes so it can't spiral.
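A bounded loop of that kind can be sketched as follows. Only `maxIterations` and `maxIterationMinutes` are names from the text; `runReviewLoop`, the `coder`/`reviewer` callbacks, and the choice to treat the minutes value as a per-iteration budget checked after each pass are assumptions for the example.

```javascript
// coder(feedback) returns a diff; reviewer(diff) returns { approved, feedback }.
// `now` is injectable so the time budget can be exercised without waiting.
function runReviewLoop({ coder, reviewer, maxIterations, maxIterationMinutes, now = Date.now }) {
  let feedback = null;
  for (let i = 1; i <= maxIterations; i++) {
    const startedAt = now();
    const diff = coder(feedback);
    const review = reviewer(diff);
    const minutes = (now() - startedAt) / 60000;
    // This sketch only notices an overrun after the iteration finishes;
    // a real tool would interrupt the running agent instead.
    if (minutes > maxIterationMinutes) return { status: "timed_out", iterations: i };
    if (review.approved) return { status: "approved", iterations: i, diff };
    feedback = review.feedback; // reviewer feedback drives the next coder pass
  }
  return { status: "max_iterations", iterations: maxIterations };
}
```

Whatever the agents do, the loop ends in one of three states (approved, timed out, or out of iterations), which is what keeps it from spiraling.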
A few more pieces worth knowing:
- MCP server. Karajan ships as an MCP server, so you can drive the full pipeline from inside Claude Code, Cursor, Codex, or any MCP host. kj_run, kj_plan, kj_code, kj_review, kj_audit, kj_rag_query: full pipeline access without leaving the conversation.
- Solomon, the AI judge. An independent role the Brain orchestrator consults when there's a real dilemma (security vs. deadline, two reviewers disagreeing). Not on every loop; only when the Brain can't decide deterministically.
- Story Board. A small local web dashboard at http://localhost:4000 showing every user story, every session, every plan, every RAG metric. The single source of truth for the team.
- Local RAG. Each project gets its own embedded code index (~/.karajan/rag.db). The pipeline retrieves real code before planning, so the architect sees what already exists. Six embedding providers (Ollama, OpenAI, Voyage, Cohere, Mistral, local ONNX). Tree-sitter-based chunkers for JS, TS, Python, Rust, Go, and Java.
- kj audit. A deterministic-first audit command: dead code, unused dependencies, complexity growth, security findings (OSV, Semgrep, SonarQube), accessibility hints, Web Performance budgets, and an AI Harness Scorecard run via Docker that gives an objective 0-100 score for "how AI-friendly is this repo right now." Keeps a per-project history and draws sparkline trends.
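For readers new to retrieval, the "retrieve real code before planning" step in the Local RAG item boils down to ranking stored code chunks against a query embedding. The sketch below uses cosine similarity, a common choice; the post does not say which metric or storage layout Karajan uses, so `cosineSimilarity`, `topK`, and the chunk shape `{ path, vector }` are assumptions for illustration only.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every chunk against the query vector and returns the k best matches,
// highest score first. The input array is not modified.
function topK(queryVector, chunks, k) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

In a real index the vectors would come from one of the embedding providers listed above and the search would run inside the database, not over an in-memory array.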
What v3.0.0 brings
v3.0.0 is a runtime alignment, not a feature drop.
Node 20 hit EOL on 2026-04-30, leaving Node 22 as the oldest LTS line still in support. Three of Karajan's dependencies independently bumped to majors requiring Node 22+: lint-staged 17, commander 15, and better-sqlite3 12.10. Rather than publishing four staggered minors patching each constraint separately, v3.0.0 bundles the runtime bump with the dep majors that depend on it.
Migration is a single command:
nvm install 22.22.1 && nvm use 22.22.1
npm install -g karajan-code@3
kj doctor # checks runtime + HW + tooling
If you were already on Node 22+ and your ~/.karajan/ works, nothing changes. No public API changes: kj run, kj plan, MCP tools, role templates, RAG, Story Board, audit, telemetry, all intact. Existing sessions, plans, RAG index, and audit history are forward-compatible; nothing to migrate by hand.
Why a major with no API changes?
Semver. Bumping the minimum Node version is a breaking change for downstream consumers: an install that worked on Node 20 now hard-fails. That alone qualifies, and the three dep majors on board each qualify on their own semver. Cutting a single v3.0.0 surfaces the runtime change once instead of four times.
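The runtime floor itself is easy to express in code. This is a minimal sketch of a Node version guard of the kind a startup check might perform; `meetsMinimumNode` and `describeRuntime` are hypothetical helpers, and the post does not describe how Karajan or kj doctor actually enforce the minimum.

```javascript
// Returns true when the given Node version string (e.g. "22.22.1")
// has a major version at or above minMajor.
function meetsMinimumNode(minMajor, version = process.versions.node) {
  const major = Number.parseInt(version.split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Builds a human-readable verdict without exiting the process,
// so the caller decides what to do with a failure.
function describeRuntime(minMajor, version = process.versions.node) {
  return meetsMinimumNode(minMajor, version)
    ? `Node ${version} is supported (minimum major: ${minMajor}).`
    : `Node ${version} is too old (minimum major: ${minMajor}).`;
}
```

Comparing only the major version is deliberate: the breaking change described here is the jump from the 20 line to the 22 line, not any particular patch release.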
The release also brings documentation polish: a new footprint and hardware requirements section in the README so you know what you're getting into before installing. kj itself is 5.2 MB on npm and ~/.karajan/ runs around 40 MB. Ollama (optional, 6.55 GB), SonarQube (optional, 1.47 GB), and the qmd cache (optional, ~2.2 GB) are the heavy optional layers. Three profiles: Minimal ~250 MB, Recommended ~8.5 GB, Full ~11 GB.
v1 to v3: how Karajan grew
Karajan didn't start as a multi-agent orchestrator. It started as a shell script.
- v0.x: One script. task → claude → diff → codex review → done. Hardcoded to two agents. No retry, no cost tracking, no SonarQube, no test integration. It worked, roughly, on a Tuesday afternoon.
- v1.0 to v1.1: Quality gates and roles. SonarQube became a mandatory step between coder and reviewer. The TDD policy meant any code change required a test change. Then the monolithic script became a role-based pipeline: BaseRole, BaseAgent, 12 configurable roles, review profiles, Solomon escalation, budget tracking.
- v1.2 to v1.3: MCP server and extensibility. The MCP server made Karajan callable from inside any MCP-speaking AI agent. The plugin system let users wrap their own CLIs as Karajan agents without forking. Planning Game integration connected the pipeline to project management. Git automation closed the loop with auto-commit, auto-PR, auto-rebase.
- v1.4 to v1.7: Resilience. Rate-limit detection across all supported CLIs. Automatic fallback when the primary coder hits its quota. Smart model selection mapping triage complexity to model tier (trivial → Haiku, complex → Opus). Interactive checkpoints replacing the brittle hard timeout. In-process MCP execution so subprocess SIGKILLs stopped killing agents mid-work.
- v1.40 to v1.46: Sovereignty, OpenSkills, parallel stories. The pipeline learned to defend its own decisions against host AI overrides (pipeline sovereignty guard). OpenSkills brought domain knowledge to coders, reviewers, and architects. Parallel user stories ran independent stories concurrently in git worktrees. First SEA binaries appeared: kj running without Node installed.
- v1.50 to v1.58: Telemetry, domain knowledge, i18n, WebPerf. Telemetry (opt-out, no code or personal data) to finally measure what was happening. Domain knowledge files (DOMAIN.md) injecting business context into every role. i18n letting the pipeline speak EN or ES end to end. Web Performance as a first-class quality gate (Core Web Vitals via Chrome DevTools MCP, inspired by Joan Leon's WebPerf Snippets).
- v2.0: Karajan Brain. A central AI orchestrator deciding routing, enriching feedback, suggesting direct actions, and compressing outputs between roles. Solomon moved from "every loop" to "real dilemmas only." Tester and security became blocking gates. The proxy subsystem (which never worked well under SSE/WebSockets) was dropped in favor of RTK for token savings.
- v2.x: The "I use this daily" phase. Too many releases to list, but the highlights: stack-aware audit (v2.9.0), Docker/shell installer (v2.10.0), Story Board hardening (v2.13.0), plan adherence scoring + golden tasks (v2.12.0), shared team Story Board (v2.31.0), editable config UI (v2.30.0), multi-language RAG with AST chunkers for Python/Rust/Go/Java (v2.34.0), retrieval quality harness (kj rag eval, v2.34.0), and the AI Harness Scorecard golden metric integrated into kj audit (v2.32.0-v2.33.0).
- v3.0.0: Runtime alignment. No new features. Node 22 baseline, three dep majors, one migration command, zero public API change. The story of "boring releases that buy you a year of headroom."
That arc, compressed: a shell script → a role-based pipeline → an orchestrator with a brain → a local-first, MCP-driven, RAG-powered, multilingual, multi-provider, test-first team tool. The same idea throughout: many different AI agents, one baton that makes them converge.
Getting started
Install:
nvm install 22.22.1 && nvm use 22.22.1
npm install -g karajan-code@3
Smoke test:
kj --version # 3.0.0
kj doctor # checks Node, agents, Docker, SonarQube, RAG...
Initialize a project:
cd your-repo
kj init # autodetects stack, suggests agents, scaffolds .karajan/
Run a task:
kj run "Add a /healthz endpoint with a unit test that returns 200" \
--coder claude --reviewer codex
Or from inside Claude Code / Cursor / any MCP host, just call kj_run.
Open the Story Board:
kj board start # http://localhost:4000
That's the loop. Describe what you want. Let the orchestra play.
Where to find it
- npm: karajan-code
- Docs: https://karajancode.com/docs
- Repo: github.com/manufosela/karajan-code
- License: AGPL-3.0
- Author: @manufosela, built in the open.
If you try it and something surprises you (good or bad), I want to know. Open an issue, reach out on GitHub, or reply here.