MORINAGA

Posted on Jul 1

Why I'm betting on Claude Code over Cursor for a solo dev pipeline

#ai #claude #programming #indiehackers

I used Cursor as my primary AI coding tool from early 2025 through February 2026. In March, when I started the three-directory-site experiment I've been documenting here, I switched to Claude Code as my main driver. That switch wasn't impulsive — I tried both on the same tasks for about two weeks before committing. Here's the specific bet I'm making, the case against it that I find genuinely compelling, and the three conditions under which I'd reverse the decision.

The bet, stated plainly

By December 2026 — nine months into this project — Claude Code will have saved me more total wall-clock time on this experiment than Cursor would have, net of context-restart overhead and the overhead of not having inline tab completion. The measurement is informal but the conditions aren't vague: if I'm spending more than 20 minutes per week re-establishing context that Cursor would have kept in a sidebar chat history, I'm wrong.
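The 20-minute threshold is only useful if the overhead actually gets logged. Here is a minimal sketch of how that tally could work, assuming a hand-maintained log with one entry per context restart. The `ContextRestart` shape and the function names are hypothetical, not something the project already has.

```typescript
// Hypothetical log entry: one row per time context had to be rebuilt by hand.
interface ContextRestart {
  date: string;    // ISO date, e.g. "2026-04-06"
  minutes: number; // time spent re-establishing context
}

const WEEKLY_BUDGET_MINUTES = 20; // the threshold stated in the bet

// Returns the ISO date (UTC) of the Monday that starts the week containing `date`.
function weekStart(date: string): string {
  const d = new Date(`${date}T00:00:00Z`);
  const daysSinceMonday = (d.getUTCDay() + 6) % 7; // Sunday (0) maps to 6
  d.setUTCDate(d.getUTCDate() - daysSinceMonday);
  return d.toISOString().slice(0, 10);
}

// Sums logged minutes per week, keyed by that week's Monday.
function minutesPerWeek(log: ContextRestart[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const entry of log) {
    const key = weekStart(entry.date);
    totals[key] = (totals[key] ?? 0) + entry.minutes;
  }
  return totals;
}

// Lists the weeks (by Monday date) where the total exceeded the budget.
function weeksOverBudget(
  log: ContextRestart[],
  budget: number = WEEKLY_BUDGET_MINUTES
): string[] {
  const totals = minutesPerWeek(log);
  return Object.keys(totals)
    .filter((week) => (totals[week] ?? 0) > budget)
    .sort();
}
```

A week with 12 and 11 logged minutes would be flagged (23 > 20), while a week with a single 5-minute restart would not.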

I'm not claiming Claude Code is better across all developer workflows. My bet is specific to the task distribution this project has required: multi-file refactors touching five or more files simultaneously, GitHub Actions debugging where terminal output is the primary signal, automated pipeline scripts that need AI assistance at invocation time, and content-generation runs that consume the staged git diff as input.

What pushed me toward Claude Code

The immediate catalyst was the GitHub Actions CI setup. The three-site architecture runs a nightly ETL, a daily article generation job, a Bluesky queue refill, and several post-deploy checks — each as a separate workflow. Debugging those workflows from Cursor's chat mode is awkward. Cursor sees the YAML file correctly; it doesn't see the runtime logs that show exactly which step failed and what the failing bash command produced. I had to copy error output from the GitHub Actions interface into Cursor's chat manually, which breaks flow.

Claude Code sits in the terminal alongside the git output, the pnpm install errors, the node script stack traces. I can paste a failing run log directly into a session without switching context. That terminal-native loop — observe failure, invoke AI, inspect proposed fix, run command — is where I first noticed a meaningful productivity gap.

The second factor is multi-file coherence. The shared Claude Haiku client is imported by five different ETL scripts across three separate apps. Refactoring it (adding a retry parameter, changing the caching behavior) means touching all five call sites together. Claude Code can load all five files into context, reason about which call sites need updating and which don't based on usage patterns, and produce a coherent multi-file diff with an explanation of each change. Cursor's "apply to multiple files" flow surfaces one diff at a time with manual approval at each step. For this specific operation, a cross-app parameter change, I find Claude Code's approach faster.
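To make the "adding a retry parameter" refactor concrete, here is a rough sketch of what such a change might look like. The post does not show the real client, so `Completion`, `CallOptions`, and `completeWithRetry` are hypothetical names, and the exponential backoff is an assumption rather than the project's actual behavior.

```typescript
// Hypothetical stand-in for the shared client's call signature.
type Completion = (prompt: string) => Promise<string>;

interface CallOptions {
  retries?: number;     // extra attempts after the first failure (default 0)
  baseDelayMs?: number; // delay before the first retry; doubles on each attempt
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `complete`, retrying on failure up to `retries` extra times.
async function completeWithRetry(
  complete: Completion,
  prompt: string,
  { retries = 0, baseDelayMs = 250 }: CallOptions = {}
): Promise<string> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await complete(prompt);
    } catch (err) {
      lastError = err;
      // Back off before the next attempt, but not after the final one.
      if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

Because `retries` defaults to 0, existing call sites keep their current behavior, and only the ones that want retries need to change. Working out which of the five call sites fall into which group is the multi-file reasoning described above.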

Third is the article-generation pipeline itself. The content quality gate runs an audit script on every generated file. The routine I run uses staged git output to feed a reviewer, then optionally patches the article before committing. That whole loop — generate, stage, review, fix, re-stage, commit — runs in the terminal. Claude Code can execute bash commands, inspect what changed, and iterate without clipboard handoffs. In Cursor I'd break that flow every time I needed to check the audit output or run a pnpm script.
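The generate, stage, review, fix, re-stage, commit loop can be sketched as a small control loop. This is an illustration under assumptions, not the actual routine: every step is an injected function standing in for a script the post does not show, and the three-round cap is invented for the example.

```typescript
// Result of one reviewer pass over the staged diff (hypothetical shape).
interface ReviewResult {
  passed: boolean;
  notes: string[];
}

// Each step is a stand-in for a real script or git command.
interface PipelineSteps {
  stage: () => void;                      // e.g. `git add` on the generated file
  stagedDiff: () => string;               // e.g. the output of `git diff --cached`
  review: (diff: string) => ReviewResult; // the audit / reviewer pass
  fix: (notes: string[]) => void;         // patch the article in place
  commit: () => void;
}

type Outcome =
  | { committed: true; rounds: number }
  | { committed: false; rounds: number; notes: string[] };

// Stages and reviews up to `maxRounds` times, fixing between failed rounds.
function runGate(steps: PipelineSteps, maxRounds: number = 3): Outcome {
  let notes: string[] = [];
  for (let round = 1; round <= maxRounds; round++) {
    steps.stage();
    const result = steps.review(steps.stagedDiff());
    if (result.passed) {
      steps.commit();
      return { committed: true, rounds: round };
    }
    notes = result.notes;
    // Only attempt a fix if another review round will follow.
    if (round < maxRounds) steps.fix(notes);
  }
  return { committed: false, rounds: maxRounds, notes };
}
```

The cap matters in an unattended run: without it, an article the reviewer never accepts would loop forever instead of surfacing its notes.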

What I'm giving up

Cursor's inline edit mode is genuinely better for micro-changes. CMD+K opens a floating edit bar at the cursor position, accepts a one-sentence description, and shows an inline diff that I can accept or reject in under two seconds. Claude Code has no equivalent. If I want to rename a variable or flip a conditional, I describe the location in the terminal, wait for the tool to navigate there, and approve the change, which is plainly slower.

Tab completion is the other thing I miss. Cursor's completions predict what you're about to type in a familiar codebase with surprising accuracy. In the TypeScript ETL scripts I iterate on constantly, Cursor already knows I'm about to write await db.execute({sql: and completes the pattern including the object shape. Claude Code has no tab-complete mode; it's interactive-only.

The third gap is session continuity. Cursor's sidebar chat persists history across sessions. I can scroll back in a Cursor conversation and see the discussion that explained a design decision two weeks ago. By default, Claude Code starts each invocation with a fresh context. If I'm debugging something I touched four days ago, I'm re-establishing context from git log and file reads rather than from a conversation thread that already captured the reasoning.

The counterargument I take seriously

The strongest case against my bet: Claude Code's terminal-native strength is also its ceiling. The operations where it beats Cursor — multi-file refactors, CI debugging, pipeline automation — are a minority of actual characters typed in a development session. Line-level edits, variable renames, docstring updates, quick function calls — those are the majority. Cursor handles them faster.

If the correct mental model is "80% of dev time is small edits, 20% is large operations," then optimizing for the 20% with Claude Code while taking a speed penalty on the 80% is a net loss. Cursor, covering 80% well and 20% adequately, might win on total wall-clock time even if Claude Code wins on per-operation speed for the big tasks.
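The 80/20 argument reduces to a weighted sum, which is easy to work through. In this sketch the factors are invented purely for illustration: small edits taking 1.25x as long in Claude Code and large operations taking 0.5x as long, both relative to Cursor. Neither number is measured.

```typescript
// Total time with Claude Code relative to Cursor (1.0 means identical).
// smallShare:  fraction of baseline dev time spent on small edits (0..1)
// smallFactor: time multiplier on small edits (greater than 1 means slower)
// largeFactor: time multiplier on large operations (less than 1 means faster)
function relativeTime(
  smallShare: number,
  smallFactor: number,
  largeFactor: number
): number {
  return smallShare * smallFactor + (1 - smallShare) * largeFactor;
}

// Small-edit share at which the two tools tie: solve relativeTime(s) = 1 for s.
function breakEvenSmallShare(smallFactor: number, largeFactor: number): number {
  if (smallFactor <= largeFactor) {
    throw new Error("expected a small-edit penalty and a large-operation gain");
  }
  return (1 - largeFactor) / (smallFactor - largeFactor);
}

// With the illustrative factors above:
//   relativeTime(0.8, 1.25, 0.5) is about 1.10  (a net loss at an 80/20 split)
//   relativeTime(0.5, 1.25, 0.5) is about 0.875 (a net win at a 50/50 split)
//   breakEvenSmallShare(1.25, 0.5) is about 0.667
```

Under these made-up factors, the bet only pays off when small edits are less than roughly two-thirds of baseline time. Different factors move the break-even point, which is why the split needs measuring instead of assuming.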

I don't have tracked data to refute this cleanly. My intuition is that this project, specifically, is skewed toward the 20% end — pipeline-wide changes, new ETL integrations, debugging CI failures — more than a typical single-app product build would be. The AI directories bet has the same honest structure: I'm making a claim based on structural reasoning, not on clean measurement.

What would resolve this: tracking dev time by operation type for four weeks and comparing the two categories. That's a straightforward measurement I haven't run. If small edits consume more than two-thirds of my Claude Code sessions, the counterargument wins.
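The four-week measurement could be as simple as tagging each session and summing minutes. A minimal sketch, assuming two hand-assigned categories and a share computed by time rather than by session count (the post leaves that choice open). The `TrackedSession` shape is hypothetical.

```typescript
type OperationType = "small-edit" | "large-operation";

// Hypothetical record: one row per tagged session.
interface TrackedSession {
  type: OperationType;
  minutes: number;
}

// Fraction of total tracked minutes spent on small edits (0 if nothing is tracked).
function smallEditShare(sessions: TrackedSession[]): number {
  let small = 0;
  let total = 0;
  for (const s of sessions) {
    total += s.minutes;
    if (s.type === "small-edit") small += s.minutes;
  }
  return total === 0 ? 0 : small / total;
}

// True when small edits exceed the two-thirds threshold stated in the post.
function counterargumentWins(
  sessions: TrackedSession[],
  threshold: number = 2 / 3
): boolean {
  return smallEditShare(sessions) > threshold;
}
```

With 70 minutes of small edits against 30 minutes of large operations, the share is 0.7 and the counterargument wins. At 60 against 40 it does not.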

The cost structure

Both tools' monthly costs are close enough that cost alone isn't the deciding factor, but the structure matters for how I think about usage.

Cursor Pro is $20/month flat. It covers unlimited completions plus a capped number of "premium" model requests per month (Claude Sonnet and GPT-4o, as of this writing), with automatic fallback to a smaller model when the cap is hit. Predictable cost, opaque per-operation consumption.

Claude Code bills against an Anthropic API key directly. For my usage pattern — roughly three to five complex sessions per day on a project at this scale — the monthly API cost lands between $15 and $30 depending on session complexity. The variance comes from how often I ask for full-codebase reads versus targeted edits. It's not consistently cheaper than Cursor Pro, and it's not consistently more expensive.
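Those figures imply a rough per-session cost, which is a handy number to keep in mind. The arithmetic below is a back-of-the-envelope sketch. It assumes 30 working days per month, which the post does not state, and it uses only the ranges quoted above rather than any actual API pricing.

```typescript
// Monthly spend from a flat per-session cost.
function monthlyCost(
  sessionsPerDay: number,
  costPerSession: number,
  daysPerMonth: number = 30 // assumption: working every day of the month
): number {
  return sessionsPerDay * costPerSession * daysPerMonth;
}

// Average per-session cost implied by a monthly bill.
function impliedCostPerSession(
  monthlyBill: number,
  sessionsPerDay: number,
  daysPerMonth: number = 30
): number {
  return monthlyBill / (sessionsPerDay * daysPerMonth);
}

// Extremes of the quoted ranges:
//   impliedCostPerSession(15, 5) is about $0.10 (cheapest month, most sessions)
//   impliedCostPerSession(30, 3) is about $0.33 (priciest month, fewest sessions)
// Per-session cost at which four sessions a day matches Cursor Pro's $20:
//   impliedCostPerSession(20, 4) is about $0.17
```

Seen this way, the comparison with Cursor Pro turns on whether the average session stays under about 17 cents at four sessions a day, which depends mostly on how often a session triggers a full-codebase read.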

The meaningful difference is visibility. With Claude Code I can see exactly what each session consumed. With Cursor Pro I don't know whether an "apply to 12 files" operation used one premium credit or ten. For a project where I'm tracking every dollar of infrastructure cost, per-operation visibility changes how I think about usage patterns.

How I partition the workflow in practice

I use Claude Code for anything that starts with a problem statement spanning multiple files: "the ETL is writing duplicate entries — find where the upsert logic lives and figure out why it's firing twice for the same ID." Terminal access plus file-reading plus bash execution in one context is worth the micro-edit tradeoff for that class of problem.

For genuinely small edits, such as a Tailwind class adjustment or a typo in a component, I open the file directly in VS Code and edit manually. No AI involved. That's faster than either tool for operations with clear, exact solutions.

What I've essentially done is partition my editing: Claude Code for architectural operations, and direct manual editing with no AI for surgical fixes and trivial changes. Cursor was attempting to cover the whole range; the result was friction at both ends because one tool can't optimize simultaneously for "large autonomous operation" and "two-keystroke inline fix."

The static site rendering choice followed similar reasoning: picking one approach for a specific constraint set rather than picking the tool that's most general. I'm applying the same thinking to development tooling.

What would change my mind

Three signals would push me back to Cursor.

Context loss compounds past a threshold. If I find myself spending more than 20 minutes per week re-explaining architectural decisions to fresh Claude Code sessions that should have retained them, the continuity gap stops being an acceptable tradeoff. That threshold is specific enough to evaluate month-by-month.

Cursor ships terminal-native agentic mode. Cursor is actively developing agentic capabilities. If they ship a mode where Cursor executes terminal commands, observes output, and iterates without requiring IDE focus — essentially what Claude Code's bash tool does — the workflow gap I've described narrows to near zero. I'd run a direct comparison again at that point.

Task distribution shifts toward UI iteration. This project is currently infrastructure-heavy: ETL pipelines, CI workflows, cross-posting automation, the pairwise compare page generation. If it matures into mostly front-end iteration — layout experiments on the directory pages, A/B testing components — the small-edit / tab-completion advantage that Cursor holds would outweigh the pipeline operations advantage. The bet is partly a claim about what the project will continue to require.

The December 2026 checkpoint

The AI directories bet has a formal October 2026 deadline. This tooling bet is softer — I'll check in by December 2026 with whatever the data shows. I'll report: how often I hit the context-loss pain point, whether the task distribution stayed infrastructure-heavy, and whether either tool materially changed its offering. If I've switched back to Cursor by then, I'll say so with specifics.

One thing I won't do is rationalize ambiguous signals optimistically. The same commitment I made about Bluesky automation quality — systematic gates, not self-review — applies here. If the measurement says I'm wrong, I'll say I'm wrong.

FAQ

Can you use Claude Code and Cursor at the same time?

Yes, and I occasionally do. Claude Code sessions happen in a terminal window; Cursor runs in the IDE alongside. The main friction: if Cursor is configured to use the same Anthropic API key and both are running Claude Sonnet simultaneously, they're competing for the same API rate limits. In practice I don't hit conflicts on this project's volume, but it's something to watch for longer agentic sessions.

Is Claude Code available everywhere Cursor is?

No. Cursor is a full IDE replacement available on Mac, Windows, and Linux. Claude Code is a CLI that requires a terminal. It doesn't have a native Windows GUI experience as of mid-2026, though WSL2 works. For developers primarily on Windows without WSL, this is a practical blocker.

What about GitHub Copilot — does either replace it?

Cursor includes Copilot-style tab completion with its own model backend; you don't need a separate Copilot subscription if you're using Cursor Pro. Claude Code doesn't offer tab completion. If inline completions are your primary AI coding use, Claude Code isn't a Copilot replacement — it's a different tool that doesn't try to be.

How does this tooling choice affect the pipeline automation?

Directly: the ETL scripts, GitHub Actions YAML, and article generation routines are all authored and debugged through Claude Code. Indirectly: the content quality gate and the QC review runs are shell scripts that fit naturally into a Claude Code session but would require clipboard handoffs in Cursor. The tool shapes how I build the automation, which shapes what automation I'm willing to maintain.

Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.

Top comments (2)

Shoogar • Jul 2

The "falsifiable 9-month bet" framing is rare and genuinely good — most tool-switch posts are vibes, this has kill criteria and a $25/mo constraint that forces honest architecture. Curious: with five ETL scripts and nightly runs on GitHub Actions, have you hit the silent-degradation problem yet, where the pipeline keeps exiting green but the quality gate is quietly passing worse and worse output? That failure mode never shows up in CI logs, and it's the one that eventually bit my own nightly Claude pipeline.

Baran Çevik • Jul 1

AI is evolving rapidly, and we will be keeping an eye on it. Thanks for the article.