Grok-2
xAI's Grok-2 is a significant step forward from Grok-1.5, featuring frontier capabilities in chat, coding, and reasoning. It outperformed Claude 3.5 Sonnet and GPT-4-Turbo on the LMSYS leaderboard and shows significant improvements in reasoning with retrieved content and tool use capabilities.
Specifications
- Parameters
- Unknown
- Architecture
- Decoder-only Transformer
- License
- Proprietary
- Context Window
- 8,192 tokens
- Type
- multimodal
- Modalities
- textimage
Benchmark Scores
Graduate-level Problems in Quantitative Analysis (GPQA) evaluates advanced reasoning on graduate-lev...
Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...
MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains including...
A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning....
Evaluates code generation capabilities by asking models to complete Python functions based on docstr...
A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI with 11.5...
Evaluates mathematical reasoning in visual contexts, combining vision and mathematical problem-solvi...
Advanced Specifications
- Model Family
- Grok
- API Access
- Available
- Chat Interface
- Available
Capabilities & Limitations
- Capabilities
- chatcodingreasoningvisiontool-use