dmae97

Posted on May 1 • Originally published at github.com

I Made Kimi Build Its Own Tiny Coding Team in 2 Days

#ai #opensource #cli #agents

I built a small experimental CLI called oh-my-kimichan.

The funny part: I built it with Kimi Code CLI itself.

The original idea was not serious.

I just wanted to see if Kimi could help me build a small orchestration layer around Kimi Code CLI — something that turns a single coding assistant into a tiny coding team.

Two days later, the project had:

19 GitHub stars
360+ npm downloads
a published npm CLI
a DAG runner
run state
logs/replay
worktree isolation
evidence gates
a live terminal HUD

It is still rough.

It is not production-grade.

It is not an official Moonshot AI project.

But it became interesting enough that I decided to keep polishing it in public.

What is oh-my-kimichan?

oh-my-kimichan is an unofficial experimental wrapper around Kimi Code CLI.

The goal is simple:

Turn Kimi Code CLI into a small spec-driven coding team.

Instead of running a single prompt, oh-my-kimichan (OMK for short) tries to structure the workflow like this:

goal
→ spec / plan
→ DAG
→ workers
→ reviewer
→ evidence gates
→ summary

The current CLI looks roughly like this:

npm i -g oh-my-kimichan

omk init
omk doctor
omk plan "add login"
omk parallel "implement login"
omk verify
omk logs latest

Why I built it

I have been experimenting with multi-agent coding workflows for a while.

Most tools in this space are centered around Claude Code, Codex, or OpenCode.

But Kimi Code CLI felt surprisingly strong in my own workflow.

So I wanted to test a simple question:

What would a Kimi-native orchestration CLI look like?

Not a huge platform.

Not a full replacement for Claude Code or OpenCode.

Just a small layer that understands Kimi's behavior and adds structure around it.

What it currently does

The current prototype includes:

Stable Kimi runner

The runner now tracks these signals separately instead of lumping them together:

exit code
stderr
timeout
MCP failure
--print mode failure
debug logs

Earlier, I made the mistake of treating "stdout exists" as success.

That was convenient for a prototype, but wrong for a real runner.

Now OMK is moving toward stricter failure classification.
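To make the "stdout exists is not success" point concrete, here is a minimal Python sketch of that kind of classification. It is not OMK's actual code: the status labels, the `classify` and `run_command` names, and the idea of spotting an MCP failure through a marker on stderr are all assumptions for illustration. It also does not cover --print mode failures.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    status: str               # "ok", "timeout", "mcp_failure", or "nonzero_exit"
    exit_code: Optional[int]  # None when the process was killed by the timeout
    stdout: str
    stderr: str


def classify(exit_code, stdout, stderr, timed_out=False):
    """Map raw process signals to one status label.

    Having stdout is deliberately NOT treated as success.
    """
    if timed_out:
        status = "timeout"
    elif exit_code != 0 and "mcp" in stderr.lower():
        # Assumption: an MCP failure leaves a recognizable marker on stderr.
        status = "mcp_failure"
    elif exit_code != 0:
        status = "nonzero_exit"
    else:
        status = "ok"
    return RunResult(status, exit_code, stdout, stderr)


def run_command(argv, timeout_s=600.0):
    """Run a command and classify the outcome instead of trusting stdout."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        # Partial output on timeout may be missing or bytes; keep only str.
        out = exc.stdout if isinstance(exc.stdout, str) else ""
        err = exc.stderr if isinstance(exc.stderr, str) else ""
        return classify(None, out, err, timed_out=True)
    return classify(proc.returncode, proc.stdout, proc.stderr)
```

A process that prints output and then exits with code 3 comes back as "nonzero_exit" here, which is exactly the case the early prototype got wrong.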

Run state

Each run creates a local state directory:

.omk/runs/<run-id>/
  goal.md
  plan.md
  state.json
  events.jsonl
  artifacts/
  logs/
  workers/
  evidence.json
  summary.md

Because everything lives on disk, a run can be inspected and replayed later:

omk runs
omk status
omk logs latest
omk replay <run-id>
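As a rough illustration of why an append-only events.jsonl helps with replay, here is a small Python sketch that creates the directory layout shown above and records events one JSON object per line. The function names, the event fields (`ts`, `kind`) and the initial `state.json` content are hypothetical, not OMK's real schema.

```python
import json
import time
from pathlib import Path


def init_run(root, run_id, goal):
    """Create .omk/runs/<run-id>/ with a few of the files and folders listed above."""
    run_dir = Path(root) / ".omk" / "runs" / run_id
    for sub in ("artifacts", "logs", "workers"):
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    (run_dir / "goal.md").write_text(goal + "\n", encoding="utf-8")
    (run_dir / "state.json").write_text(json.dumps({"status": "created"}), encoding="utf-8")
    return run_dir


def append_event(run_dir, kind, **data):
    """Append one JSON object per line so the log can be replayed in order."""
    event = {"ts": time.time(), "kind": kind, **data}
    with open(Path(run_dir) / "events.jsonl", "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def read_events(run_dir):
    """Return all recorded events, oldest first."""
    path = Path(run_dir) / "events.jsonl"
    if not path.exists():
        return []
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Appending instead of rewriting means a crashed run still leaves a readable history up to the last completed write.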

DAG execution

OMK uses a DAG-based execution model.

Each node can have:

dependencies
role
retry policy
timeout
failure policy
evidence requirements

The current shape is still simple, but it already makes the workflow easier to reason about than one giant prompt.
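One way to picture the DAG model is to group nodes into "waves", where every node in a wave has all of its dependencies finished and can run in parallel. The sketch below uses Python's standard `graphlib` module; the node names are made up, and OMK's real scheduler may work differently (it also has roles, retries, and timeouts that are left out here).

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group nodes into waves; every node in a wave can run in parallel.

    `dependencies` maps a node name to the names it depends on.
    graphlib raises CycleError if the graph is not a DAG.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical node names for an "add login" goal.
example = {
    "plan": [],
    "backend": ["plan"],
    "frontend": ["plan"],
    "reviewer": ["backend", "frontend"],
}
print(execution_waves(example))
# [['plan'], ['backend', 'frontend'], ['reviewer']]
```

The reviewer only becomes ready after both workers are marked done, which is the property that is hard to guarantee with one giant prompt.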

Worktree isolation

Worker execution is moving toward real Git worktree isolation.

The target structure is:

git worktree add .omk/worktrees/<run-id>/<worker-id> -b omk/<run-id>/<worker-id>

This should make parallel agent work safer and easier to merge.
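A minimal Python sketch of how a runner could build and execute that command per worker. The helper names are hypothetical; only the `git worktree add <path> -b <branch>` form comes from the target structure above.

```python
import subprocess
from pathlib import PurePosixPath


def worktree_command(run_id, worker_id):
    """Build the argv for one worker's isolated worktree and branch."""
    path = PurePosixPath(".omk/worktrees") / run_id / worker_id
    branch = f"omk/{run_id}/{worker_id}"
    return ["git", "worktree", "add", str(path), "-b", branch]


def add_worktree(repo_dir, run_id, worker_id):
    """Create the worktree; raises CalledProcessError if git refuses."""
    subprocess.run(worktree_command(run_id, worker_id), cwd=repo_dir, check=True)
```

Passing the command as a list avoids shell quoting problems if a run or worker id ever contains unusual characters.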

Evidence gates

Instead of trusting the model's final message, OMK verifies concrete evidence.

Example:

{
  "node": "reviewer",
  "required": [
    { "type": "file-exists", "path": "src/auth/service.ts" },
    { "type": "command-pass", "command": "npm test" },
    { "type": "diff-nonempty" },
    { "type": "summary-present" }
  ]
}

The goal is to make agent output less hand-wavy.
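Here is a hedged Python sketch of how a gate shaped like the JSON above could be evaluated. The four evidence types come from the example; everything else (function names, the choice to look for summary.md in the run directory, using `git diff --quiet` for the diff check) is an assumption, not OMK's actual implementation.

```python
import subprocess
from pathlib import Path


def check_item(item, workdir, run_dir):
    """Return True when one evidence requirement is satisfied."""
    kind = item["type"]
    if kind == "file-exists":
        return (Path(workdir) / item["path"]).is_file()
    if kind == "command-pass":
        # shell=True because the gate stores the command as a single string.
        proc = subprocess.run(item["command"], shell=True, cwd=workdir,
                              capture_output=True)
        return proc.returncode == 0
    if kind == "diff-nonempty":
        # `git diff --quiet` exits with 1 when tracked files have unstaged changes.
        proc = subprocess.run(["git", "diff", "--quiet"], cwd=workdir)
        return proc.returncode == 1
    if kind == "summary-present":
        summary = Path(run_dir) / "summary.md"
        return summary.is_file() and summary.read_text(encoding="utf-8").strip() != ""
    return False  # unknown evidence types fail closed


def evaluate_gate(gate, workdir, run_dir):
    """Return (passed, failed_types) for a gate shaped like the JSON above."""
    failed = [item["type"] for item in gate["required"]
              if not check_item(item, workdir, run_dir)]
    return (not failed, failed)
```

Returning the list of failed types, not just a boolean, gives the summary something specific to report when a node does not pass.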

New direction: spec-driven execution

I recently started connecting OMK with Spec Kit-style planning.

The idea is:

spec
→ plan
→ tasks
→ OMK DAG
→ Kimi workers
→ evidence gates
→ report

So the future flow should feel like this:

omk feature "add OAuth login"

Internally, OMK should:

create or detect a spec
generate a plan
convert tasks into a DAG
run Kimi workers in isolated worktrees
verify evidence
generate a report

That is the direction I want to explore next.

What went wrong

A lot.

The first version had rough edges:

too much README marketing
not enough hard verification
weak failure classification
workspace copy pretending to be worktree isolation
experimental commands mixed with stable commands
debug logs leaking into normal output

The biggest lesson:

Multi-agent coding tools should not trust text output.

They need state, logs, evidence, replay, and verification.

What I am building next

For the next few days, I am focusing on:

omk spec
omk dag from-spec
omk feature
better omk summary
better omk mcp doctor
real worktree merge flow
cleaner README and demo GIF

The current goal is not to build a huge agent platform.

The goal is smaller:

Make Kimi Code CLI feel like a tiny, inspectable, spec-driven coding team.

Repository

GitHub:

https://github.com/dmae97/oh-my-kimichan

Install:

npm i -g oh-my-kimichan

Try:

omk init
omk doctor
omk plan "add login"
omk parallel "implement login"

Again: this is experimental and unofficial.

But if you are interested in Kimi Code CLI, agent orchestration, DAG-based coding workflows, or spec-driven AI development, I would love feedback.
