DEV Community

Discussion on: Gemini 2.5 Flash vs Claude 3.7 Sonnet: 4 Production Constraints That Made the Decision for Me

Collapse
 
tommy_leonhardsen_81d1f4e profile image
Tommy Leonhardsen

The JSON stability comparison isn't really Gemini vs Claude — it's Gemini's responseSchema vs prompted Claude. That's not a fair test.

Claude's tool use and structured outputs enforce schema at the decoding layer exactly like responseSchema does. A simple tool definition with your campaign schema and tool_choice: {type: "tool", name: "generate_campaign"} gets you to 100% parse reliability, same guarantee, no middleware.

The PDF claim also doesn't hold up. Claude has supported native base64 PDF ingestion directly in the Messages API for a long time — no OCR step, no external service. Same inlineData pattern, different field names.

Latency and cost comparisons are legitimate and Gemini Flash wins those on current pricing. But two of your four constraints were implementation gaps, not model limitations.

Also worth noting: Claude 3.7 Sonnet is over a year old. Claude Sonnet 4.6 and Haiku 4.5 are both faster, cheaper and better than any 3.7 model was. The latency and cost numbers in this article were already outdated when it was published.

Collapse
 
dumebii profile image
Dumebi Okolo

You're right on both counts, and I will be honest about it.

On JSON stability: the framing was I'm precise and a bit clickbaity.
The comparison was
Gemini with responseSchema vs Claude with prompted JSON — not
Gemini vs Claude's structured output ceiling. Claude's tool use
with tool_choice enforcement gets you to the same structural
guarantee at the decoding layer. The article notes this but buries
it in a footnote when it should have been the headline caveat.
I take responsibility for that.

On PDF ingestion: you're correct, and I got this entirely wrong. Claude's
Messages API does support native base64 PDF ingestion via the
document content block — no OCR preprocessing required. The
"5 steps, 2 additional failure points" claim was based on a
misread of the integration path and shouldn't have made it into
the article as written. I'll update that section.

On model choice: I was already inside the Vertex AI ecosystem and
evaluated models available in the Google Model Garden. Claude 3.7
Sonnet was the version accessible there at the time. But I agree the
article should have stated that more explicitly rather than framing
it as a general Gemini vs Claude evaluation.

The latency and cost numbers are the ones I'm most confident in
because they were measured in my actual production environment
(Vercel serverless, non-streaming, same input payload). Those
comparisons hold.

Thanks for pushing back though. This is the kind of correction that
makes the article worth more than it was when I published it.
I will be revising the article.