Gemini 2.0 Pro

Name: Gemini 2.0 Pro
Author: Google

Google · Proprietary · Verified

Google's best model yet for coding performance and complex prompts, with better understanding of, and reasoning about, world knowledge than any previous release. It features a 2 million token context window and can call tools such as Google Search and code execution.

Release Date: 2025-02-05

Specifications

Architecture: Mixture-of-Experts Multimodal Transformer
License: Proprietary
Context Window: 2,000,000 tokens
Type: multimodal
Modalities: text, image, video, audio
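
As a rough illustration of what a 2,000,000-token context window means in practice, the sketch below estimates whether a set of documents would fit. The 4-characters-per-token ratio is an assumed rule of thumb for English text, not a property of the Gemini tokenizer, and `reserved_for_output` is a hypothetical budget for the reply; an exact count would need the provider's own token-counting facility.

```python
import math

# Taken from the specification above.
CONTEXT_WINDOW_TOKENS = 2_000_000
# Assumption: rough average for English text, not the real tokenizer ratio.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` from its character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], reserved_for_output: int = 8_192) -> bool:
    """Return True if the estimated prompt size leaves room for the reply.

    `reserved_for_output` is a hypothetical allowance for the model's response.
    """
    prompt_tokens = sum(estimate_tokens(doc) for doc in documents)
    return prompt_tokens + reserved_for_output <= CONTEXT_WINDOW_TOKENS
```

Under this heuristic, roughly 8 MB of plain English text is the point at which the prompt alone would fill the window.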

Benchmark Scores

BIRD-SQL: 59.3

BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) is a cross-domain dataset ...

CoVoST 2: 40.6

CoVoST 2 is an open, large-scale multilingual speech-to-text translation (ST) dataset developed to a...

EgoSchema: 71.9

EgoSchema is a very long-form video question-answering dataset and benchmark for evaluating long vid...

FACTS Grounding: 82.8

The FACTS Grounding Leaderboard evaluates LLMs' ability to generate factually accurate long-form res...

Global-MMLU-Lite: 86.5

A balanced collection of culturally sensitive and culturally agnostic MMLU tasks designed for effici...

GPQA: 64.7

Graduate-Level Google-Proof Q&A (GPQA) evaluates advanced reasoning on graduate-lev...

HiddenMath: 65.2

Google's internal holdout set of competition math problems...

LiveCodeBench-v5: 36

Evaluates models on their ability to solve coding problems in real-time....

MATH: 91.8

A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning....

MMLU-Pro: 79.1

MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains including...

MMMU: 72.7

A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI with 11.5...

Michelangelo Long-Context Reasoning (1M): 74.7

MRCR (Multi-Round Coreference Resolution) is part of the Michelangelo benchmark suite that evaluates...

SimpleQA: 44.3

A benchmark of simple but precise questions to test factual knowledge and reasoning....

Advanced Specifications

Model Family: Gemini
API Access: Available
Chat Interface: Available
Multilingual Support: Yes

Capabilities & Limitations

Capabilities: superior coding performance, complex prompt handling, advanced reasoning, world knowledge, tool use, long context
Tool Use Support: Yes

Resources

deepmind.google/models/gemini/pro/

Related Models

FunctionGemma

Google

FunctionGemma is a specialized version of Gemma 3 270M fine-tuned for function calling and designed to run on edge devices. It bridges natural language and software execution, translating user commands into executable API actions. The model excels at unified action and chat capabilities, switching seamlessly between generating structured function calls and conversational responses. Built specifically for customization through fine-tuning, it demonstrated 85% accuracy on Mobile Actions after training (up from 58% baseline). Small enough to run on mobile phones and edge devices like NVIDIA Jetson Nano, it uses Gemma's 256k vocabulary to efficiently tokenize JSON and multilingual inputs.
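
The switch between structured function calls and conversational replies described above can be sketched from the application's side. This is a minimal illustration, not FunctionGemma's actual output format: the JSON shape (`name`, `arguments`), the `REGISTRY` table, and the two device actions are all hypothetical names introduced for the example.

```python
import json
from typing import Callable


# Hypothetical device actions an app might expose to the model.
def set_alarm(time: str) -> str:
    return f"Alarm set for {time}"


def send_message(to: str, body: str) -> str:
    return f"Message to {to}: {body}"


# Hypothetical lookup table from function name to implementation.
REGISTRY: dict[str, Callable[..., str]] = {
    "set_alarm": set_alarm,
    "send_message": send_message,
}


def handle_model_output(raw: str) -> str:
    """Run a function call if `raw` encodes one; otherwise treat it as chat text.

    Assumes (for illustration only) that a call is a JSON object of the form
    {"name": "<function>", "arguments": {...}}.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # not JSON, so pass it through as a conversational reply
    if not isinstance(payload, dict) or not isinstance(payload.get("name"), str):
        return raw  # valid JSON but not shaped like a function call
    func = REGISTRY.get(payload["name"])
    if func is None:
        raise ValueError(f"Unknown function: {payload['name']}")
    return func(**payload.get("arguments", {}))
```

For example, `handle_model_output('{"name": "set_alarm", "arguments": {"time": "07:00"}}')` would invoke `set_alarm`, while plain text such as `"Sure, what time?"` is returned unchanged. Rejecting unknown function names keeps the model from triggering anything the app did not register.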

Gemini 3 Pro

Google

Gemini 3 Pro is Google's flagship multimodal foundation model, released in November 2025. Built on a sparse Mixture-of-Experts (MoE) Transformer architecture, it features a 1 million token context window and native understanding of text, images, audio, and video. The model introduces 'Deep Think' capabilities for enhanced reasoning, controlled via a 'thinking_level' parameter, and is optimized for 'agentic' workflows and 'vibe coding'—the generation of full applications from natural language. It supports advanced function calling and tool use, making it suitable for complex software engineering and long-context analysis tasks.

Type: multimodal
Parameters: Proprietary
Release Date: 2025-11-18
License: Proprietary

Gemini 2.5 Flash-Lite

Google

Google's most cost-efficient and fastest 2.5 model yet. Higher quality than 2.0 Flash-Lite on coding, math, science, reasoning and multimodal benchmarks. Excels at high-volume, latency-sensitive tasks like translation and classification.