Gemini Diffusion

Name: Gemini Diffusion
Author: Google

Google, Proprietary, Verified

Gemini Diffusion is Google's state-of-the-art experimental text diffusion model, built to explore a new kind of language model that gives users greater control, creativity, and speed in text generation. Unlike traditional autoregressive models, which generate text one token at a time, diffusion models refine noise step by step and generate entire blocks of tokens at once.
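To make the contrast with token-by-token generation concrete, here is a minimal toy sketch of block-wise iterative refinement, written in the style of masked discrete diffusion. It is an illustration only: the source does not describe Gemini Diffusion's actual algorithm, and `toy_denoiser`, `diffusion_generate`, and the `MASK` placeholder are hypothetical names. The "denoiser" cheats by reading a known target sequence, whereas a real model would predict tokens from learned weights.

```python
import random

MASK = "_"  # hypothetical placeholder for a fully "noised" position


def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for a learned model.

    Proposes a (token, confidence) pair for every position in one parallel
    pass. It copies the answer from `target` and draws a random confidence.
    """
    return [(target[i], rng.random()) for i in range(len(tokens))]


def diffusion_generate(target, steps=4, seed=0):
    """Start from an all-masked block and refine it over a fixed number of steps."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    history = [list(tokens)]
    for step in range(1, steps + 1):
        proposals = toy_denoiser(tokens, target, rng)
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        # Reveal enough positions that a fraction step/steps is filled in.
        revealed_goal = round(len(target) * step / steps)
        n_reveal = revealed_goal - (len(target) - len(masked))
        # Commit the most confident proposals first; the rest stay masked.
        masked.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in masked[:n_reveal]:
            tokens[i] = proposals[i][0]
        history.append(list(tokens))
    return tokens, history


target = "diffusion models refine whole blocks in parallel".split()
final, history = diffusion_generate(target, steps=4)
for step, state in enumerate(history):
    print(step, " ".join(state))
```

The number of refinement passes is fixed by `steps` rather than by the sequence length, which is the property the description above attributes to diffusion-style generation. An autoregressive loop would need one pass per token.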

Release Date: 2025-05-20

Specifications

Architecture: Text Diffusion Model
License: Proprietary
Context Window: 32,000 tokens
Type: text
Modalities: text

Benchmark Scores

AIME-2025: 23.3

American Invitational Mathematics Examination (AIME) 2025 problems....

BIG-bench: 15

Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark of 204 diverse tasks....

GPQA: 40.4

Graduate-Level Google-Proof Q&A (GPQA) evaluates advanced reasoning on graduate-lev...

HumanEval: 89.6

Evaluates code generation capabilities by asking models to complete Python functions based on docstr...

MMLU: 69.1

Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...

SWE-bench: 22.9

Software Engineering Benchmark (SWE-bench) evaluates models on real-world software engineering tasks...


Advanced Specifications

Model Family: Gemini
API Access: Not Available
Chat Interface: Not Available
Multilingual Support: Yes

Capabilities & Limitations

Capabilities: rapid response, more coherent text, iterative refinement, error correction, parallel generation, code, math, reasoning, editing, high-speed generation
Known Limitations: experimental status, waitlist access only, limited availability
Notable Use Cases: text editing, code generation and editing, mathematical problem solving, rapid content generation, iterative text refinement

Resources

deepmind.google/models/gemini-diffusion/

Related Models

FunctionGemma

Google

FunctionGemma is a specialized version of Gemma 3 270M fine-tuned for function calling and designed to run on edge devices. It bridges natural language and software execution, translating user commands into executable API actions. The model excels at unified action and chat capabilities, switching seamlessly between generating structured function calls and conversational responses. Built specifically for customization through fine-tuning, it demonstrated 85% accuracy on Mobile Actions after training (up from 58% baseline). Small enough to run on mobile phones and edge devices like NVIDIA Jetson Nano, it uses Gemma's 256k vocabulary to efficiently tokenize JSON and multilingual inputs.

Gemini 3 Pro

Google

Gemini 3 Pro is Google's flagship multimodal foundation model, released in November 2025. Built on a sparse Mixture-of-Experts (MoE) Transformer architecture, it features a 1 million token context window and native understanding of text, images, audio, and video. The model introduces 'Deep Think' capabilities for enhanced reasoning, controlled via a 'thinking_level' parameter, and is optimized for 'agentic' workflows and 'vibe coding'—the generation of full applications from natural language. It supports advanced function calling and tool use, making it suitable for complex software engineering and long-context analysis tasks.

Type: multimodal
Parameters: Proprietary
Release Date: 2025-11-18
License: Proprietary


Gemini 2.5 Flash-Lite

Google

Google's most cost-efficient and fastest 2.5 model yet. Higher quality than 2.0 Flash-Lite on coding, math, science, reasoning and multimodal benchmarks. Excels at high-volume, latency-sensitive tasks like translation and classification.