DiffusionGemma 26B & NVIDIA Cosmos 3: The Two Open-Source Models Reshaping AI This Week
Two massive open-source model drops just landed, and they couldn't be more different — one brings diffusion-powered LLM inference to your gaming GPU, the other teaches robots to understand the physical world.
🧠 Google DiffusionGemma 26B-A4B — Local AI, 4x Faster
Google DeepMind dropped DiffusionGemma 26B-A4B this week, and it's a genuine paradigm shift. This is a 26-billion-parameter Mixture-of-Experts model that activates only 3.8B parameters per token, which keeps per-token compute low. All 26B weights still have to sit in memory, so quantization is what lets it fit on a consumer NVIDIA RTX 5090 or 4090.
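To sanity-check the "fits with quantization" claim, here is a rough back-of-the-envelope sketch. It counts weights only and assumes a flat bits-per-weight figure; KV cache, activations, and quantization metadata are not included, so real usage will be higher. The function name `weight_gib` is made up for this example.

```python
# Back-of-the-envelope weight memory for a 26B-parameter model.
# Assumption: weights only. KV cache, activations, and quantization
# metadata (scales, zero points) add overhead that is not counted here.

TOTAL_PARAMS = 26e9  # total parameter count, taken from the model name
GIB = 1024 ** 3


def weight_gib(params: float, bits_per_weight: float) -> float:
    """Return the size of the weights in GiB at a given precision."""
    return params * bits_per_weight / 8 / GIB


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_gib(TOTAL_PARAMS, bits):.1f} GiB")
```

Under these assumptions the weights come to about 48.4 GiB at 16-bit, 24.2 GiB at 8-bit, and 12.1 GiB at 4-bit. That is why the quantized model, not the full-precision one, is the realistic target for a single consumer card. Note that the 3.8B active-parameter figure does not appear in this calculation: it affects speed, not footprint.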
The real magic? It uses diffusion-based parallel decoding instead of standard autoregressive generation: rather than emitting one token at a time, it refines many token positions in each step. The claimed result is 4x faster inference on local hardware, with built-in self-correction and bidirectional context understanding. It's multimodal too, processing text, images, and video inputs.
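To give a feel for why parallel decoding needs fewer steps, here is a toy sketch. It is not DiffusionGemma's actual algorithm: `toy_denoiser` is a hypothetical stand-in that already knows the answer and attaches random confidence scores, and the loop simply commits the most confident masked positions each step. It also never re-masks a committed token, so it does not model the self-correction part.

```python
import random

MASK = "<mask>"


def toy_denoiser(seq, target, rng):
    """Hypothetical stand-in for a model: propose a token and a
    confidence score for every position that is still masked.
    A real model would predict these from the full context."""
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(seq)
        if tok == MASK
    }


def parallel_decode(target, tokens_per_step=4, seed=0):
    """Start fully masked; each step commits the most confident proposals."""
    rng = random.Random(seed)
    seq = [MASK] * len(target)
    steps = 0
    while MASK in seq:
        proposals = toy_denoiser(seq, target, rng)
        # Rank masked positions by confidence, highest first.
        ranked = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
        # Commit the top few; everything else stays masked for the next step.
        for i in ranked[:tokens_per_step]:
            seq[i] = proposals[i][0]
        steps += 1
    return seq, steps


target = "the quick brown fox jumps over the lazy dog today".split()
decoded, steps = parallel_decode(target, tokens_per_step=4)
print(decoded == target, steps, "steps vs", len(target), "autoregressive steps")
# prints: True 3 steps vs 10 autoregressive steps
```

The toy fills 10 tokens in 3 steps instead of 10. Step count is not the same thing as wall-clock speed, since each parallel step does more work, so this illustrates the idea rather than reproducing the 4x figure.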
Available now on Hugging Face under an open license. This is the model that makes "run everything locally" actually practical.
🤖 NVIDIA Cosmos 3 — The First Open Omnimodal Physical AI
At Computex Taipei, NVIDIA unveiled Cosmos 3, the first open-source omnimodal world model for physical AI. It jointly processes and generates language, image, video, audio, and action sequences in a unified mixture-of-transformers architecture.
Cosmos 3 ranks #1 among open models on Robotics, Smart Space, and Driving benchmarks. It can generate robot training data, simulate real-world physics, and plan actions — all in one model. Think of it as a foundation model that doesn't just talk about the world, but understands how it moves.
Both models are open-weight and available now. Whether you're building the next local AI assistant or training a warehouse robot, this week's releases just gave you the tools.
Links:
What are you building with these? Drop a comment below.
