ironbyte-rgb for crescevo

Posted on Jun 26 • Originally published at ai.crescevo.com

Gemini 3.5 Pro and the Announcement-to-Shipping Gap Costing Google

#ai #machinelearning #llm #programming

TL;DR

Gemini 3.5 Pro, announced at Google I/O on May 19, 2026, entered its general-availability window on June 23, but as of the latest reporting it remains in limited preview for Vertex AI enterprise customers only, not the public Gemini app or AI Studio.
Headline spec: a 2-million-token context window, the largest in any production frontier model, reportedly ~10x GPT-5's and ~16x Claude's current limit.
Deep Think, its extended-reasoning mode, is gated to Google's $250/month Ultra tier, the most expensive consumer AI subscription on the market.
Pricing (~$15/$60 per million tokens) and benchmark gains remain Google's claims, not verified; prediction markets put the odds of a public release by June 30 at only ~50-55%.

Google's next flagship is technically "launching" this week, and that's exactly the problem. Gemini 3.5 Pro has genuinely impressive announced specs, including a 2M-token context window no competitor matches, but it has slipped repeatedly since I/O, and the real story isn't the model. It's the widening gap between what Google announces and what Google ships.

What's actually shipping (and what isn't)

Sundar Pichai told the I/O audience on May 19 to "give us until next month." It's now late June, and Gemini 3.5 Pro is still in limited preview for select Vertex AI enterprise customers; it has not reached the public Gemini app, AI Studio, or the consumer subscription. Until it does, Gemini 3.1 Pro remains Google's GA flagship. Treat the specs below as Google's positioning; verified numbers arrive only with the model card at true GA.

The 2-million-token context window: the one real differentiator

The standout, if it ships as described, is the context window: 2 million tokens, double that of Gemini 3.5 Flash and the largest deployed in any production frontier model. Reporting frames it as ~10x GPT-5's window and ~16x Claude's current production limit. This is not a benchmark-bragging-rights number; it's a capability unlock. A 2M window means reasoning over an entire codebase, a full data room, or months of conversation history in a single request, without the retrieval-and-stitching gymnastics that long-context workflows require today. For enterprise document and code use cases, that is the feature that actually matters.
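To make "fits in a single request" concrete, here is a minimal sizing sketch. It rests on assumptions: the 4-characters-per-token ratio is a rough rule of thumb rather than any model's real tokenizer, the 6 MB codebase is hypothetical, and the two smaller windows are simply what the reported ~10x and ~16x ratios imply, not verified figures for any model.

```python
import math

# Assumption: ~4 characters per token, a rough rule of thumb for English
# text and code. A real tokenizer will give different counts.
CHARS_PER_TOKEN = 4.0

# Reported (unverified) window, plus the sizes implied by the ~10x / ~16x ratios.
REPORTED_WINDOW = 2_000_000
WINDOWS = {
    "2M window (reported)": REPORTED_WINDOW,
    "implied by ~10x": REPORTED_WINDOW // 10,  # 200,000 tokens
    "implied by ~16x": REPORTED_WINDOW // 16,  # 125,000 tokens
}


def estimate_tokens(num_chars: int, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Rough token estimate from a character count."""
    return math.ceil(num_chars / chars_per_token)


def fits_in_window(prompt_tokens: int, window_tokens: int,
                   reserved_output_tokens: int = 0) -> bool:
    """True if the prompt plus reserved output space fits in the window."""
    return prompt_tokens + reserved_output_tokens <= window_tokens


# Hypothetical workload: a codebase of about 6 million characters.
codebase_tokens = estimate_tokens(6_000_000)

for label, window in WINDOWS.items():
    ok = fits_in_window(codebase_tokens, window, reserved_output_tokens=8_192)
    print(f"{label}: {'fits' if ok else 'needs retrieval or chunking'}")
```

Under these assumptions the hypothetical codebase comes to roughly 1.5 million tokens, which fits only the 2M window. Swap in a real tokenizer count before drawing conclusions about your own workload.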

Deep Think, and the $250/month question

Deep Think is Google's "think before answering" mode, a hidden-scratchpad reasoning approach similar to OpenAI's o-series. The notable part is the packaging: Deep Think is gated exclusively to the $250/month Ultra tier, making it the priciest consumer-facing AI subscription by a wide margin. Google is betting a slice of users will pay flagship-software money for frontier reasoning. Whether that tier finds an audience is its own open question.

The real story: announce-vs-ship is Google's recurring tax

Google keeps pre-announcing frontier capability it then struggles to ship on time, and competitors keep filling the gap. While 3.5 Pro sits in preview, OpenAI and Anthropic ship into general availability. The cost isn't just a slipped date; it's that "Gemini 3.5 Pro" has been a talking point for over a month with no public model behind it, which trains developers to build on what's actually available (3.1 Pro, or rivals) rather than wait. In a market where switching is cheap, the announcement-to-availability gap is where Google leaks momentum, and this launch is being read as a test of whether it can close it.

What this means for you

Don't architect on unreleased specs. The 2M window and pricing are Google's claims until the GA model card. Build on what you can call today; pilot 3.5 Pro only once it's actually in AI Studio/Vertex GA.
If long context is your bottleneck: 3.5 Pro is the one to evaluate the day it ships. Whole-codebase and whole-corpus reasoning is a genuine step change worth testing on your real workloads.
Price the reasoning tier carefully. Deep Think at $250/mo is premium; validate that the quality delta justifies it for your use case before committing seats.
Hedge providers. Until Gemini's GA cadence is reliable, single-vendor dependence carries schedule risk.
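One way to act on the "hedge providers" advice is to keep model calls behind a small fallback chain, so a preview model that isn't yet available to you degrades to something that is. The sketch below is illustrative only: `preview_model` and `ga_model` are hypothetical stand-ins, not real SDK calls, and a production version would also want timeouts and narrower exception handling.

```python
from typing import Callable, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain raised an error."""


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose for this sketch
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real client calls.
def preview_model(prompt: str) -> str:
    raise RuntimeError("model not available to this project")


def ga_model(prompt: str) -> str:
    return f"[ga] {prompt}"


used, answer = complete_with_fallback(
    "Summarize this repo",
    [("preview", preview_model), ("ga", ga_model)],
)
print(used, answer)
```

Ordering the list from most-preferred to most-available means the day a preview model reaches your account, it starts being used without a code change.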

Frequently asked questions

Is Gemini 3.5 Pro available now?

Not broadly. As of late June 2026 it's in limited preview for select Vertex AI enterprise customers; it has not reached the public Gemini app, AI Studio, or the consumer subscription. Gemini 3.1 Pro remains the GA flagship.

What is the 2-million-token context window good for?

Processing very large inputs (entire codebases, large document sets, long histories) in one request. Reporting puts it at roughly 10x GPT-5 and 16x Claude's current production limit, the largest in any production frontier model.

What does Deep Think cost?

It's gated to Google's Ultra tier at $250/month, the most expensive consumer AI subscription currently on the market.

Are the benchmarks and pricing confirmed?

No. Pricing (~$15/$60 per million tokens) and benchmark gains are reported expectations, not verified facts. Official numbers arrive with the GA model card.
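For a sense of scale, here is the arithmetic at the reported rates. This is a sketch under assumptions: it reads "$15/$60" as input/output price per million tokens (the usual convention, but not confirmed here), and the 4,000-token response size is hypothetical.

```python
# Reported, unverified rates. Reading them as input/output is an assumption.
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 60.0


def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_rate: float = INPUT_USD_PER_MTOK,
                     output_rate: float = OUTPUT_USD_PER_MTOK) -> float:
    """Cost of one request at per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# One request that fills the full 2M window, with a hypothetical 4,000-token reply.
full_window = request_cost_usd(input_tokens=2_000_000, output_tokens=4_000)
print(f"One full-window request: ${full_window:.2f}")
print(f"100 such requests: ${100 * full_window:,.2f}")
```

At these rates a single full-window request costs about $30, almost all of it input, so filling the window on every call adds up quickly. Rerun the numbers once official pricing is published.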
