A month of switching between Claude Opus 4.7 and GPT-5.4 for actual work

#ai #devtools #productivity #programming

I didn't set out to do a comparison. I just had both open because work pays for one and I pay for the other, and after a month of bouncing between them on real tickets I had enough of an opinion that writing it down felt worth it. This isn't a benchmark post. Benchmarks are everywhere and they rarely match how a tool feels at 4pm on a Friday when you're three layers deep in someone else's code.

Where Opus 4.7 won for me

Long context that stays coherent. I threw a genuinely ugly refactor at it — a service file that had grown past 1,200 lines with three different people's conventions tangled together — and it held the whole thing in its head without losing track of what it had already changed. That's the part that's hard to capture in a leaderboard. GPT-5.4 is sharp, but a couple of times it "fixed" something it had introduced two messages earlier, which Opus didn't do as often.

It's also better at saying "this is risky" instead of just doing the thing. When I asked it to migrate a database column it actually flagged the backfill problem before I hit it in staging. Small thing, saved me an evening.

Where GPT-5.4 won

Speed and first-draft code. For greenfield stuff — write me a parser, scaffold this endpoint — GPT-5.4 got me to a runnable draft faster and the code was usually closer to idiomatic on the first pass. If I'm prototyping and I'll throw the code away anyway, I reach for it.

It's also less verbose by default. Opus likes to explain. Sometimes I want the explanation, often I just want the diff.

The honest annoyances

Both still hallucinate library APIs for anything that isn't extremely popular. I lost time to a confidently-wrong import from each of them in the same week. Neither one is a substitute for reading the actual docs when you're using a niche package.

And cost is real. The deeper-reasoning runs add up faster than you'd guess if you're iterating a lot, and it's easy to not notice until the bill.

What I actually landed on

I stopped trying to pick a winner. Opus for the gnarly, stateful, "don't break the existing system" work. GPT-5.4 for fast drafts and throwaway prototyping. The switching cost is low once you stop treating it as a loyalty decision.

If you've only ever used one because it's what your company provisioned, it's genuinely worth running the other for a week on your own tasks. The gap between them is smaller than the marketing implies, but the shape of where each is strong is different enough to matter.

Curious what other people have settled on — especially anyone doing heavier agent / multi-file work than I am.

If you want the longer version, I wrote up the full side-by-side — cost per task, where each model actually wins, the benchmark caveats — over at OpenAI Tools Hub.

DEV Community

A month of switching between Claude Opus 4.7 and GPT-5.4 for actual work

Where Opus 4.7 won for me

Where GPT-5.4 won

The honest annoyances

What I actually landed on

Top comments (0)