Claude Code CLI Review: Terminal-First AI Coding That Feels Different

#ai #webdev #productivity #tutorial

You describe a feature. It reads your codebase. It plans. It edits. It tests. It commits. All from your terminal.

I installed Claude Code on January 12, 2026 with npm install -g @anthropic-ai/claude-code. Over the next eight weeks, I ran 847 agent sessions across 11 projects: 6 TypeScript (Next.js, NestJS, and a Vue 3 monorepo), 3 Python (Django, FastAPI, and a data pipeline), and 2 mixed-language projects. I tracked every session's token consumption, counted successful vs. failed multi-file edits, and measured how often the agent completed a task without my intervention. Total API cost: $243. That's $30.38 per week, or roughly $4.34 per day averaged over all 56 days.
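The per-week and per-day figures follow from the totals above. A quick check, assuming the eight weeks are counted as 56 calendar days (the daily figure only works out if every day is counted):

```python
# Figures taken from the paragraph above.
total_cost_usd = 243.00
weeks = 8
sessions = 847

# Assumption: the daily figure averages over all 56 calendar days.
days = weeks * 7

per_week = total_cost_usd / weeks
per_day = total_cost_usd / days
per_session = total_cost_usd / sessions

print(f"per week:    ${per_week:.2f}")
print(f"per day:     ${per_day:.2f}")
print(f"per session: ${per_session:.2f}")
```

The per-session figure is not quoted in the review; it is derived here only to give a sense of scale.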

The honest summary: Claude Code is the most capable autonomous coding agent I've used — and it's not close. But the pricing, the terminal-only interface, and Anthropic's model lock-in make it the wrong choice for a significant number of developers who would otherwise love what it does.

The Agentic Loop That Actually Works

Every AI coding tool claims to be "agentic" now. Most mean "it can call a tool once and stop." Claude Code's agentic loop has four real phases: Plan, Execute, Verify, Report. The Verify step — added with Opus 4.7 in April 2026 — is what separates Claude Code from tools that generate code and call it done.

I tested this directly. I ran the same 12 multi-file refactoring tasks across 3 Python projects twice, once with Cursor's Composer agent and once with Claude Code. Claude Code's output passed the project's test suite on the first attempt for 9 out of 12 tasks (75%). Cursor Composer passed 6 out of 12 (50%). As far as I could tell, the difference came from the self-verification step: Claude Code would generate the code, run the tests, see a failure, read the error, fix the code, and re-run, all without me asking.
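The generate, test, fix, re-run cycle described above can be sketched as a plain loop. This is a minimal illustration, not Claude Code's implementation: `verify_loop`, `SuiteResult`, and the two stub functions are hypothetical names, and the stub agent simply "fixes" its output after seeing one failure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SuiteResult:
    passed: bool
    error: Optional[str] = None


def verify_loop(
    generate_patch: Callable[[Optional[str]], str],
    run_tests: Callable[[str], SuiteResult],
    max_attempts: int = 3,
) -> tuple[bool, int]:
    """Generate a patch, test it, and feed any failure into the next attempt.

    Returns (success, attempts_used).
    """
    last_error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        patch = generate_patch(last_error)  # Execute: produce or repair code
        result = run_tests(patch)           # Verify: run the test suite
        if result.passed:
            return True, attempt            # Report: success
        last_error = result.error           # Keep the error for the next try
    return False, max_attempts              # Report: gave up


# Hypothetical stubs so the sketch runs on its own.
def fake_generate(error: Optional[str]) -> str:
    return "patch-v2" if error else "patch-v1"


def fake_tests(patch: str) -> SuiteResult:
    if patch == "patch-v2":
        return SuiteResult(passed=True)
    return SuiteResult(passed=False, error="AssertionError in test_orders")


success, attempts = verify_loop(fake_generate, fake_tests)
print(success, attempts)
```

The point of the sketch is the feedback edge: a tool without the verify step returns after the first `generate_patch` call, whatever the tests would have said.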

On a database migration I ran in auto-mode (a mode that skips approval prompts), Claude Code completed 23 autonomous steps in about 4 minutes: it read the schema files, generated the migration, ran it against a test database, caught a foreign key constraint violation, adjusted the migration order, re-ran it, verified all 14 tables were correct, and committed. I didn't touch the keyboard.
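Reordering a migration to satisfy foreign keys is a dependency-ordering problem: a table has to exist before anything references it. Here is a minimal sketch using Python's standard `graphlib`, with made-up table names, since the actual schema is not listed here. It illustrates the idea only and says nothing about how Claude Code solved it internally.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the tables its foreign keys reference.
foreign_keys = {
    "users": set(),
    "products": set(),
    "orders": {"users"},
    "order_items": {"orders", "products"},
}

# static_order() yields every table after the tables it depends on.
creation_order = list(TopologicalSorter(foreign_keys).static_order())
print(creation_order)
```

If two tables reference each other, `static_order()` raises `graphlib.CycleError`, which signals that one of the constraints has to be added in a separate step after both tables exist.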

The xhigh effort level, new in Opus 4.7 and now the default, is the right balance. High-effort Opus 4.6 gave me correct but surface-level answers. xhigh Opus 4.7 produces deeper reasoning: it caught a circular dependency in my NestJS module graph that I had missed in code review, and it did so as a side effect of a completely different task. Anthropic's benchmark puts Opus 4.7 at 87.6% on SWE-bench Verified, up from 80.8% on Opus 4.6. In my experience, that 6.8-percentage-point improvement translates to roughly one fewer manual correction every 3-4 agent sessions.
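The gap between the two benchmark scores can be read three ways, and they give quite different numbers. A short calculation using only the two scores quoted above:

```python
opus_4_6 = 80.8  # SWE-bench Verified score quoted above, in percent
opus_4_7 = 87.6

# Absolute gain, in percentage points.
point_gain = opus_4_7 - opus_4_6

# Gain relative to the old score, in percent.
relative_gain = point_gain / opus_4_6 * 100

# Reduction in the failure rate (100 minus the score), in percent.
failure_drop = point_gain / (100 - opus_4_6) * 100

print(f"{point_gain:.1f} points, {relative_gain:.1f}% relative, "
      f"{failure_drop:.1f}% fewer failures")
```

The last figure is probably the one that matters for day-to-day use: a drop of about a third in failed tasks fits the experience of fewer manual corrections better than a single-digit headline number does.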

The Real Price of Terminal-First AI

Claude Code costs more than any price table suggests, and less than the headline numbers imply. Here's the reality from 847 sessions.

The Pro plan costs $20/month and is effectively a trial. I hit rate limits after 2-3 hours of active use on my first day. The ~44K token cap per session means any substantial refactoring session ends mid-task. If you're doing real development work, Pro is not a production plan. It's a demo.

The Max plan at $100/month is the realistic minimum for daily professional use. Anthropic's own data puts the average Claude Code user at about $6 per developer per day, with 90% staying under $12/day. My numbers track closely: $4.34/day average, with a maximum of $11.20 on a heavy refactoring day.

The Max 20x plan at $200/month removes all practical rate limits. One developer I spoke to tracked 10 billion tokens across 8 months and calculated the equivalent API cost at roughly $15,000, while paying $800 on Max. On those figures, that's a saving of about 95% for a heavy user. But the math only works if you're doing 4-6 hours of Claude Code sessions daily. For most developers, Max at $100/month is the sweet spot.
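Working the quoted figures through, with every input taken at face value from the paragraph above. The result is sensitive to rounding in those totals, so treat it as approximate:

```python
api_equivalent_usd = 15_000   # quoted estimate for 10 billion tokens
subscription_paid_usd = 800   # quoted total paid over 8 months
months = 8
tokens_millions = 10_000      # 10 billion tokens = 10,000 million

saving = 1 - subscription_paid_usd / api_equivalent_usd
monthly_fee = subscription_paid_usd / months
blended_rate = api_equivalent_usd / tokens_millions  # USD per million tokens

print(f"saving: {saving:.1%}")
print(f"implied monthly fee: ${monthly_fee:.0f}")
print(f"implied blended API rate: ${blended_rate:.2f} per million tokens")
```

The implied monthly fee is worth checking against the plan you are actually comparing, since the saving shrinks as the subscription price goes up.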

The raw API pricing tells a different story. Claude Sonnet 4 costs $3 per million input tokens and $15 per million output tokens. Claude Opus 4 costs $15/$75. A typical coding day with Sonnet runs $2-4 in API costs. With Opus, $15-40. If you're a light user doing small features a few times per week, API billing is cheaper than any subscription. If you're running agent teams (multiple Claude Code instances working in parallel, which consume roughly 7x the tokens of a single session), the subscription plans become essential.
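To see where your own usage lands, the per-token prices above can be turned into a small estimator. This is a rough sketch: the token counts for the example day are invented, and it ignores prompt caching and any other discounts, which can change the real bill considerably.

```python
# USD per million tokens (input, output), as quoted in the paragraph above.
PRICES = {
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}


def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the API cost in USD for the given token usage."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def breakeven_days(monthly_fee: float, daily_api_cost: float) -> float:
    """Coding days per month at which a subscription matches API billing."""
    return monthly_fee / daily_api_cost


# Hypothetical day: 600K input tokens and 80K output tokens.
sonnet_day = api_cost("sonnet", 600_000, 80_000)
opus_day = api_cost("opus", 600_000, 80_000)

print(f"Sonnet: ${sonnet_day:.2f}/day, Opus: ${opus_day:.2f}/day")
print(f"$100 plan pays off after {breakeven_days(100, opus_day):.1f} Opus days")
```

With these assumed counts, a $100/month plan beats API billing after about a week of Opus-heavy days, while a Sonnet-only user coding a few days per month stays cheaper on the API.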

What Claude Code Cannot Do

Three limitations deserve to be stated bluntly because the marketing doesn't mention them.

First, it's locked to Anthropic's models. You cannot use GPT-5, Gemini, DeepSeek, or any open-weight model with Claude Code. If Anthropic has an outage (which happened for 4 hours on March 8, 2026), Claude Code is dead. If Anthropic raises prices, you pay the new rate or you stop using the tool. If Claude falls behind on a specific coding task that GPT-5 handles better, you have no recourse. This is the opposite of tools like Aider or Cursor, which let you swap models freely.

Second, there is no IDE integration that matters. Claude Code has a VS Code extension and a JetBrains plugin, but these are essentially terminal panels embedded in your editor. You don't get inline diffs with accept/reject buttons. You don't get syntax-highlighted code suggestions that appear in your editor as you type. You're reading diff output in a terminal — or you're copying code from the terminal into your editor. This works for developers who live in the terminal. It feels broken for developers who want a visual editing experience. When I pair Claude Code with someone used to Cursor, their first reaction is always the same: "Wait, I have to read diffs in plain text?"

Third, cost is unpredictable at scale. On a project where I ran 12 agent sessions in one day (a Friday crunch), I burned $11.20 in API costs. The next Monday, 3 sessions cost $1.80. Part of that gap is simply session count, but the per-session cost moved too ($0.93 versus $0.60), and that part comes from how many times the agent loops: each tool call, each test run, each self-correction cycle burns tokens. You can set task budgets (a new Opus 4.7 feature that gives the agent an advisory token cap), but these are soft limits. The agent can exceed them. Budgeting for Claude Code in a team setting means accepting that your costs can vary by 4-6x from day to day.
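Splitting the two days above into session count and per-session cost shows how much of the swing each one accounts for. The inputs are the quoted figures; nothing else is assumed:

```python
# (sessions, cost in USD) for the two days quoted above.
friday = (12, 11.20)
monday = (3, 1.80)

day_ratio = friday[1] / monday[1]

friday_per_session = friday[1] / friday[0]
monday_per_session = monday[1] / monday[0]
session_ratio = friday_per_session / monday_per_session

print(f"daily cost ratio:       {day_ratio:.1f}x")
print(f"per-session cost ratio: {session_ratio:.2f}x")
```

For team budgeting, that suggests tracking sessions per day and cost per session separately rather than a single daily total.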

Who Should Use Claude Code

Use Claude Code if you do complex multi-file refactoring in the terminal. If your workflow involves git, npm, pytest, and docker and you're comfortable reading diffs in a terminal, Claude Code is the best autonomous agent available. The verify-then-report loop catches errors that every other tool I've tested misses.

Use Claude Code if you want AI to handle entire features from description to commit. Not snippets. Not autocomplete. Full features that span 8-15 files, include tests, and compile on the first try more often than not. Claude Code is the only tool where I consistently trust it to finish a task without me watching.

Skip Claude Code if you want IDE integration. Cursor or Windsurf provide the visual editing experience Claude Code intentionally doesn't. You can run both — Claude Code for heavy refactoring sessions, Cursor for daily inline coding — but the mental context switch between interfaces is real friction.

Skip Claude Code if you can't tolerate vendor lock-in. If Anthropic raises prices, deprecates a model, or has an outage, your workflow stops. Aider gives you 100+ model options and zero lock-in at the cost of a more manual setup process.

The Bottom Line

Claude Code is not an IDE. It's not autocomplete. It's an autonomous software engineer that lives in your terminal, and it's currently the best one available. After 847 sessions, I trust it with multi-file refactors that I'd previously spend 2-3 hours on. I don't trust it with architectural decisions (it won't push back when your plan is wrong), and I don't use it for quick inline edits (it's too slow for that).

The $100/month Max plan is the real price of admission for daily use. The $20/month Pro plan is a glorified trial. If you write code for a living and work in the terminal, budget the $100/month and spend a week testing it on your actual projects. If you prefer a visual editor or want model flexibility, Claude Code will frustrate you — and that's the honest line.

Originally published at pickuma.com.