David Van Assche (S.L)

Posted on Apr 15

Every AI Coding CLI in 2026: The Complete Map (30+ Tools Compared)

#ai #devtools #productivity #beginners

A sequel to my most-read post. Six months later, the landscape exploded. Here's every tool that matters, what it costs, and what it actually does.

The AI coding tool market went from "a few options" to "overwhelming" in about six months. New CLIs weekly. Pricing wars. Open-source alternatives rivaling the paid ones. Chinese models hitting 77%+ on SWE-bench. Free tiers that would've been unthinkable a year ago.

I've tested, researched, or tracked 30+ tools. Here's the complete map.

Tier 1: Cloud Subscriptions (Pay Monthly, They Host Everything)

These are the "just works" options. You pay, they handle models, infrastructure, and updates.

Tool	Monthly Cost	Model(s)	Type	SWE-bench	Notable
Claude Code	$17-20 (Pro), $100-200 (Max)	Claude 4.6 Opus/Sonnet	Terminal agent	80.9%	1M context. Uses 5.5x fewer tokens than Cursor. Full hook/plugin system.
Cursor	$16/mo	Multi-model	VS Code fork	Varies	Largest community. Best tab completions. Most polished UX.
Windsurf	$20/mo	Multi-model	IDE	Varies	"Flows" persistent context. Raised from $15 in March 2026.
Codex CLI	With ChatGPT Plus ($20/mo)	GPT-5 series	CLI + Desktop	—	Cloud sandbox execution. Autonomous agent.
Antigravity	$20 (Pro), $250 (Ultra)	Gemini	Agent IDE	—	Google's entry. Parallel agents. Built-in Chrome for testing.
Mistral Vibe	$15/mo (Le Chat Pro)	Devstral 2	CLI	—	Apache 2.0 source code. Paid models.
Amp (Sourcegraph)	Free tier ($10/day cap)	Multi-model	CLI + IDE	—	"Deep mode" autonomous research. No markup on API costs.

The verdict: Claude Code wins on capability (1M context, best SWE-bench, hook system). Cursor wins on UX. Windsurf and Antigravity bet on parallel agents. Codex bets on cloud sandboxing.

Token efficiency matters more than subscription price. Claude Code using 5.5x fewer tokens than Cursor means the real cost difference is bigger than the $1-4/mo subscription gap suggests.

Tier 2: Genuinely Free (Real Usage, No Tricks)

These tools offer meaningful free access — not "free trial" but actually usable for daily work:

Tool	Free Tier	What You Get	Upgrade Path
Gemini CLI	1,000 requests/day	Gemini 2.5 Pro/Flash routing. Just login with Google.	Pay-as-you-go
GitHub Copilot CLI	50 premium requests/mo	Deep GitHub integration. Natural for existing users.	$10/mo
Amazon Q Developer	Free tier	Best for AWS-heavy workflows.	AWS pricing
Kiro (Amazon)	Free tier	Spec-driven: generates requirements before code. Auditable trail.	TBD
Qwen Code	Free API (!)	Alibaba's CLI agent. Apache 2.0. Completely free API access.	—

Gemini CLI at 1,000 free requests/day is the story here. For many developers, this is effectively unlimited. If you're budget-constrained or evaluating, start here.

Qwen Code's free API is underappreciated. Alibaba is subsidizing it for market share — take advantage while it lasts.

Tier 3: Open Source BYOK (Free Tool, Bring Your API Key)

The largest category. Zero subscription — you pay only for model inference via API keys:

Tool	GitHub Stars	Type	Model Support	What Makes It Different
OpenCode	140K+	CLI	75+ providers	Universal adapter. If a model exists, OpenCode supports it.
Aider	39K+	CLI	Any (inc. local)	Git-native. Auto-commits. Most mature. 4.1M installs, 15B tokens/week.
Cline	— (5M installs)	VS Code ext	Any	Most adopted open-source coding extension.
Continue.dev	26K	IDE ext	Any	Only tool with full VS Code + JetBrains support.
Goose	—	CLI + Desktop	Any + MCP	Block/Square's agent. Apache 2.0. Native MCP integration.
Roo Code	—	VS Code ext	Any	"When other agents break down" — reputation for reliability on large multi-file changes.
OpenClaw	—	CLI	GLM, MiniMax, Qwen, etc	Gateway to Chinese model ecosystem.
Zed	—	Editor	BYOK	Rust-native. Fastest editor in the category.
iFlow	—	CLI	Any OpenAI-compatible	SubAgents. Controlled file permissions.
Kimi Code CLI	—	CLI	Kimi K2.5	Moonshot's agent. 100-agent swarm capability.
BLACKBOX	—	Multi	Proprietary + BYOK	Completions + chat + search.

The real cost of BYOK: With Claude Sonnet at $3/$15 per million tokens, moderate daily use runs $10-15/month. With OpenRouter, you can compare prices across 100+ models. With local models, the cost is $0.

Aider remains the gold standard for terminal pair-programming. Git-native workflows, clean commit history, works with everything from GPT to local Ollama models.

Tier 4: Truly Local (Offline, Self-Hosted, Zero Cloud)

For the privacy-conscious, air-gapped environments, or anyone who wants zero recurring costs:

Inference Runtimes

Runtime	Best For	Effort	Speed
Ollama	Easiest start. One command: `ollama pull qwen2.5-coder`	Minimal	Good
llama.cpp	Maximum control. Custom compilation for your exact hardware.	High	Best (tuned)
LM Studio	Visual model management. Side-by-side comparison. GUI sliders.	Minimal	Good
vLLM	Production serving. PagedAttention cuts memory 50%+. 2-4x throughput.	Medium	Production-grade
Tabby	Self-hosted copilot. Full IDE integration on your own infra.	Medium	Good

Best Local Coding Models (April 2026)

Model	Params	SWE-bench	License	Runs On
GLM-5 (Zhipu)	744B MoE (40B active)	77.8%	MIT	vLLM / llama.cpp (needs 80GB+ VRAM for full)
Kimi K2.5 (Moonshot)	1T MoE	76.8%	Open	Similar — enterprise hardware
Devstral 2 (Mistral)	—	—	Apache 2.0	Ollama, llama.cpp
Qwen 2.5 Coder (Alibaba)	7B-72B	—	Apache 2.0	Ollama (7B on laptop, 32B on desktop)
MiniMax M2	230B MoE (10B active)	—	Open	8% of Claude's price, 2x speed
DeepSeek Coder V2	Various	—	MIT	Ollama, llama.cpp

For a laptop: Qwen 2.5 Coder 7B or DeepSeek Coder V2 7B via Ollama. Runs fine on 16GB RAM.

For a desktop with GPU: Qwen 2.5 Coder 32B via Ollama. Excellent quality, runs on RTX 3060 12GB.

For a server: GLM-5 or Kimi K2.5 via vLLM. These compete with Claude on coding benchmarks.

Tier 5: Model Routers (Connect Anything to Anything)

Router	What It Does
9router	Connects 40+ providers to Claude Code, Cursor, Copilot, Antigravity, etc.
CLIProxyAPI	Wraps Gemini CLI, Codex, Claude Code as OpenAI-compatible API. Use free Gemini models through any tool.
OpenRouter	Universal API gateway. Compare prices across 100+ models. Pay-per-token.

CLIProxyAPI is wild: it wraps Gemini CLI's free tier as an OpenAI-compatible API, which means you can use Gemini 2.5 Pro through Aider, Cline, or any OpenAI-compatible tool — for free.

Quick Decision Matrix

If you want...	Use this
Best capability, cost be damned	Claude Code (Max)
Best free experience	Gemini CLI
Best open-source CLI	Aider
Best IDE experience	Cursor
Best for teams	Continue.dev (VS Code + JetBrains)
Zero cloud dependency	Ollama + Qwen 2.5 Coder
Best Chinese model access	OpenClaw
Planning before coding	Kiro
Git-native workflows	Aider
Parallel agents	Antigravity or Windsurf

Next in this series: *Part 2 — Running AI Coding Agents for Free: The Open Source & Local Guide** — deep dive into BYOK setups, local model configuration, and getting Claude-level performance without a subscription.*

Also: *Part 3 — What Every AI Coding Tool Gets Wrong** — the measurement gap that none of these tools address.*

This is a sequel to The best (free - cheap) AI friendly Cli and Coding environments.

Top comments (4)

Harjot Singh • May 31

Genuinely useful map - bookmarking this. The thing that jumps out comparing 30+ of these is how much they converge: almost all of them are an interface to a coding loop (read, edit, run, repeat) and the real differentiation is the harness around the model, not the model itself. Two tools on the same Claude/GPT backend can feel worlds apart purely based on context management, tool design, and how they handle failure/retries.

The axis I wish more of these comparisons had: where does each one stop? Most CLIs end at "code on your machine" or "PR opened" - they're coding assistants, not shipping pipelines. That's the gap I built Moonshift into (a multi-agent pipeline that takes a prompt all the way to a deployed SaaS on your own GitHub+Vercel) - the goal isn't to win the autocomplete race, it's to cover the boring 20% (auth, billing, deploy) and end at "launched," with multi-model routing keeping a full build ~$3 flat. Different category than most of this list, but adjacent. If you ever do a v2, a "stops at: editor / PR / deploy / live app" column would be gold - it's the dimension that actually maps to how far the tool carries you.

David Van Assche (S.L) • Jun 21

Yah, thanks, I'm about to do another version of this... the cli environments move quickly and where we had 5 last year, and 30 this year, in June we now have 100s, so need to split between the most useful and split between tech and not tech harnesses.

Also Desktop harnesses would be another thing to publish and compare...

Charles Valerio Howlader • Apr 17

Thanks for sharing!

Salik Ahmad • May 29

thanks for sharing