
Google Releases DiffusionGemma, an Open-Weight Diffusion Text Model at 500+ Tokens/Second

Published 2026-06-10 · Foundation Models · Medium · ⭐ Timeline Candidate

Summary

Google released DiffusionGemma (`google/diffusiongemma-26B-A4B-it`) under an Apache 2.0 license. It is an open-weight productization of Google's previously shelved Gemini Diffusion research. Simon Willison measured at least 500 tokens/second when running the model through NVIDIA's free NIM cloud API (the experimental version had peaked at around 857 tokens/second). Diffusion-based text generation is notable because it produces tokens in parallel rather than strictly left-to-right, which can deliver dramatically lower latency for c
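
To put the throughput figures quoted in the summary in wall-clock terms, here is a minimal Python sketch. It assumes a steady generation rate and ignores network latency and time to first token. The helper `generation_seconds` and the 1,000-token response length are hypothetical, chosen only for illustration; only the 500 and 857 tokens/second figures come from the summary.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to emit num_tokens at a steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Assumption: a 1,000-token response, picked purely for illustration.
RESPONSE_TOKENS = 1000

# Throughput figures as quoted in the summary.
RATES = [
    ("measured via NIM (at least)", 500.0),
    ("experimental version peak (approx.)", 857.0),
]

for label, rate in RATES:
    seconds = generation_seconds(RESPONSE_TOKENS, rate)
    print(f"{label}: {seconds:.2f} s for {RESPONSE_TOKENS} tokens")
```

Under these assumptions, a 1,000-token response takes about 2.00 seconds at 500 tokens/second and about 1.17 seconds at 857 tokens/second. Real end-to-end latency depends on factors the summary does not report.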

Alignment: New signal not yet covered
Related Positions: multi-model-multi-vendor, ai-infrastructure-strategy
Tags: google, diffusiongemma, open-weights, diffusion-model, inference-speed, apache-2, nvidia-nim, gemma