Gemma 3
Gemma 3 is a collection of lightweight, state-of-the-art open models built from the same research and technology that powers the Gemini 2.0 models. It is presented as the most capable model you can run on a single GPU or TPU: it delivers state-of-the-art performance for its size and outperforms Llama3-405B, DeepSeek-V3, and o3-mini in preliminary human preference evaluations on LMArena's leaderboard. It features advanced text and visual reasoning, a 128K-token context window, function calling, and support for over 140 languages.
Specifications
- Parameters
- 1B, 4B, 12B, 27B
- Architecture
- Decoder-only Transformer
- License
- Gemma Terms of Use
- Context Window
- 128,000 tokens
- Type
- multimodal
- Modalities
- text, image, video
Benchmark Scores
Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...
Grade School Math 8K (GSM8K) consists of 8.5K high-quality grade school math word problems....
Evaluates code generation capabilities by asking models to complete Python functions based on docstr...
Tests common sense natural language inference through completion of scenarios....
A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning....
Measures a model's tendency to reproduce falsehoods commonly believed by humans....
Discrete Reasoning Over Paragraphs (DROP) requires models to resolve references in a passage and per...
A multilingual evaluation set spanning 42 languages that combines machine translations for MMLU ques...
Advanced Specifications
- Model Family
- Gemma
- API Access
- Available
- Chat Interface
- Available
- Multilingual Support
- Yes
- Variants
- Full 32-bit, BF16 (16-bit), SFP8 (8-bit), Q4_0 (4-bit), INT4 (4-bit)
- Hardware Support
- CUDA, TPU, CPU, Metal, NVIDIA H100, NVIDIA Jetson Nano, NVIDIA Blackwell, AMD ROCm, Google Cloud TPU
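The listed variants differ mainly in how many bits each weight occupies, which drives how much memory a given model size needs. The sketch below is a back-of-the-envelope estimate only: it assumes the nominal parameter counts (1B, 4B, 12B, 27B) and the nominal bit widths from the variant names, and it ignores KV cache, activations, runtime overhead, and the per-block scale metadata that real quantization formats carry. The names `weight_gb` and `footprint_table` are illustrative, not part of any Gemma tooling.

```python
# Rough weight-only memory estimate per model size and precision.
# Assumptions: nominal parameter counts and nominal bits per weight;
# KV cache, activations, and quantization metadata are ignored.

PARAMS_BILLIONS = {"1B": 1, "4B": 4, "12B": 12, "27B": 27}
BITS_PER_WEIGHT = {"Full 32-bit": 32, "BF16": 16, "SFP8": 8, "Q4_0": 4, "INT4": 4}


def weight_gb(params_billions: float, bits: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 weights) * (bits / 8 bytes) / 1e9 bytes per GB
    return params_billions * bits / 8


def footprint_table() -> dict:
    """Map (model size, variant) to estimated weight storage in GB."""
    return {
        (size, variant): weight_gb(params, bits)
        for size, params in PARAMS_BILLIONS.items()
        for variant, bits in BITS_PER_WEIGHT.items()
    }


if __name__ == "__main__":
    for (size, variant), gb in footprint_table().items():
        print(f"{size:>4}  {variant:<12} ~{gb:g} GB")
```

Under these assumptions the 27B model needs roughly 54 GB for BF16 weights and roughly 13.5 GB at 4 bits, which is why the quantized variants matter for single-GPU deployment. Actual requirements will be higher once context length and runtime overhead are included.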
Capabilities & Limitations
- Capabilities
- question answering, summarization, reasoning, multimodal understanding, function calling, structured output, code generation, image analysis, video analysis, text extraction, object identification, large context processing, STEM reasoning, agentic experiences, workflow automation
- Known Limitations
- 1B model is text-only without image support; lower-parameter models have reduced capabilities; performance varies with quantization level
- Notable Use Cases
- single GPU/TPU deployment, on-device AI applications, question answering systems, document summarization, visual content analysis, video analysis, programming interfaces, multilingual applications, large document processing, AI-driven workflows, agentic applications, image safety checking
- Function Calling Support
- Yes
- Tool Use Support
- Yes
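Function calling and tool use usually mean that the model emits a structured request which the application parses and executes. The sketch below shows only that application-side step. It assumes a hypothetical JSON shape, `{"name": ..., "arguments": {...}}`, and a hypothetical `get_weather` tool; this page does not specify the actual output format, so treat the shape as a placeholder to adapt to whatever your prompt or serving stack produces. No model is called here.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> str:
    """Hypothetical tool: returns canned text instead of calling a real service."""
    return f"Sunny in {city}"


# Registry of tools the application is willing to run.
TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch(model_output: str) -> Any:
    """Parse a tool call of the assumed shape and run the matching tool."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")

    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    return TOOLS[name](**args)


if __name__ == "__main__":
    # Hand-written stand-in for model output (assumption, not real Gemma output).
    sample = '{"name": "get_weather", "arguments": {"city": "Zurich"}}'
    print(dispatch(sample))
```

Keeping an explicit registry and rejecting unknown names means the model can only trigger functions the application has deliberately exposed.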