Apple is reportedly using a custom 1.2T-parameter Google model for Siri, per Reuters. The model, significantly larger than the estimated 300B parameters of Gemini 3.5 Flash, will power parts of the next Siri overhaul, while simple queries are expected to run locally on the device.
Key facts
- Apple using custom 1.2T-parameter Google model for Siri.
- Gemini 3.5 Flash estimated at 300 billion parameters.
- Simple queries expected to run locally on device.
- Reported by Reuters via @kimmonismus.
- Next Siri overhaul expected at WWDC 2026.
Apple is not merely adding Gemini to Siri—it is reportedly using a custom 1.2T-parameter Google model as the brain behind parts of the next Siri overhaul, according to Reuters. This model is substantially larger than Gemini 3.5 Flash, which is estimated to have around 300 billion parameters.
The Size vs. Speed Trade-Off
The 1.2T parameter count raises immediate questions about performance and latency. An assistant has to answer everyday queries quickly, and a model of this size is expensive to serve. Simple queries are expected to run locally on the device, which requires efficient on-device inference. A model of this scale is far too large to fit in a phone's memory, so local handling would presumably fall to a much smaller model, with the 1.2T model reserved for server-side requests.
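To put the scale in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need. The parameter counts are the reported and estimated figures from this article, not confirmed specifications. The precision levels (16-bit, 8-bit, 4-bit) are illustrative assumptions, since nothing is known about how the model is actually stored or served. The estimate ignores activations, the attention cache, and any serving overhead.

```python
# Rough weight-memory estimate. Parameter counts are reported or estimated
# figures from the article; the precisions are illustrative assumptions.

# Bytes needed to store one parameter at each (assumed) precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts (unconfirmed).
MODELS = {
    "custom Siri model (reported)": 1.2e12,
    "Gemini 3.5 Flash (estimated)": 300e9,
    "earlier on-device model (approx.)": 3e9,
}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def footprint_table() -> dict:
    """Map each model name to its weight memory in GB at every precision."""
    return {
        name: {
            precision: weight_memory_gb(count, size)
            for precision, size in BYTES_PER_PARAM.items()
        }
        for name, count in MODELS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{p}: {gb:,.1f} GB" for p, gb in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, the reported model's weights would come to about 600 GB even at 4 bits per parameter, against roughly 1.5 GB for a 3B-parameter model at the same precision. If the model uses a mixture-of-experts design, only part of it would be active per token, which lowers compute per query but not the storage requirement.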
Unique Take: Apple's Strategic Bet on Third-Party Models
The unique angle here is not just that Apple is using a Google model, but that it is deploying a custom, massive model for a consumer-facing assistant. This marks a departure from Apple's historical preference for smaller, on-device models like the 3B-parameter models used in earlier Apple Intelligence features. The 1.2T parameter count suggests Apple is prioritizing capability over latency, at least for server-side queries, and betting that Google's architecture can deliver both speed and accuracy.
Implications for the Assistant Market
This move positions Siri to compete more aggressively with standalone AI assistants like ChatGPT and Claude. The custom Google model could give Siri a significant edge in tasks requiring deep reasoning or broad knowledge, while local handling of simple queries preserves privacy and responsiveness. However, the success hinges on whether the model can run fast enough for real-time interaction—a known pain point for large models in production.
What's Next
Apple is expected to unveil more details at WWDC, likely in June 2026. The coming months are also expected to bring GPT-5.6, Sonnet 4.8/Opus 4.8, and Gemini 3.5 Pro, making for a crowded field in assistant technology.
What to watch
Watch WWDC 2026 in June for official details on Siri's capabilities and latency benchmarks. Also track whether Apple discloses the model's performance on standard language-model benchmarks such as MMLU or GSM8K.
Originally published on gentic.news