Grok 3
xAI's most advanced model yet, blending superior reasoning with extensive pretraining knowledge. Trained on the Colossus supercluster with 10x the compute of previous state-of-the-art models. Features test-time compute and reasoning capabilities through reinforcement learning, allowing it to think for seconds to minutes while correcting errors and exploring alternatives. Achieved an Elo score of 1402 in the Chatbot Arena.
Specifications
- Parameters
- Unknown (multi-trillion estimated)
- Architecture
- Decoder-only Transformer
- License
- Proprietary
- Context Window
- 1,000,000 tokens
- Type
- multimodal
- Modalities
- textimagevideo
Benchmark Scores
Graduate-level Problems in Quantitative Analysis (GPQA) evaluates advanced reasoning on graduate-lev...
MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains including...
Long-Context Frontiers benchmark evaluating long-context language models on real-world tasks requiri...
A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI with 11.5...
EgoSchema is a very long-form video question-answering dataset and benchmark for evaluating long vid...
The FACTS Grounding Leaderboard evaluates LLMs' ability to generate factually accurate long-form res...
Advanced Specifications
- Model Family
- Grok
- API Access
- Available
- Chat Interface
- Available
- Variants
- Grok 3 (Think)Grok 3 miniGrok 3 mini (Think)
Capabilities & Limitations
- Capabilities
- reasoningmathematicscodingtest-time-computechain-of-thoughttool-useagentsmultimodal-understanding
- Notable Use Cases
- advanced reasoningmathematical problem solvingcode generationresearch assistanceagent workflows
- Function Calling Support
- Yes
- Tool Use Support
- Yes