AIME-2024
Category: Mathematics
Status: Pending Verification
American Invitational Mathematics Examination (AIME) 2024 problems.
Published: 2024
Score Range: 0-100
Top Score: 96.6
AIME-2024 Leaderboard
Rank | Model | Provider | Score | Parameters | Released | Type
---|---|---|---|---|---|---
1 | GPT-OSS-120B | OpenAI | 96.6 | 117B total (5.1B active per token) | 2025-08-05 | Text
2 | GPT-OSS-20B | OpenAI | 96 | 21B total (3.6B active per token) | 2025-08-05 | Text
3 | Grok 3 Mini | xAI | 95.8 | Unknown | 2025-02-19 | Multimodal
4 | o4-mini | OpenAI | 93.4 | Unknown | 2025-04-16 | Multimodal
5 | Qwen-3 | Alibaba | 85.7 | 235B (22B active) | 2025-04-29 | Text
6 | o1 | OpenAI | 83.3 | Unknown | 2024-09-12 | Multimodal
7 | Claude 3.7 Sonnet | Anthropic | 80 | Unknown | 2025-02-24 | Multimodal
8 | DeepSeek-R1 | DeepSeek | 79.8 | 671B (37B activated) | 2025-01-20 | Text
9 | Kimi K2 | Moonshot AI | 69.6 | 1T total (32B activated) | 2025-07-11 | Text
10 | Grok 3 | xAI | 52.2 | Unknown (multi-trillion estimated) | 2025-02-19 | Multimodal
About AIME-2024
Methodology
AIME-2024 evaluates model performance on the problems from the 2024 American Invitational Mathematics Examination. Every AIME answer is an integer from 000 to 999, so responses can be graded by exact match against the official answer key. Scores are reported on a scale of 0 to 100, where higher scores indicate better performance. For detailed information about the scoring system and methodology, refer to the original paper.
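As a minimal sketch of this kind of scoring, the following assumes exact-match grading over the integer answers and a score equal to the percentage of problems solved; the function name and the equal-weighting assumption are illustrative, not taken from the benchmark's paper.

```python
def aime_score(predictions, answers):
    """Return a 0-100 score: the percentage of exact integer matches.

    Assumes each AIME answer is an integer in [0, 999] and every
    problem is weighted equally (an assumption for illustration).
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    correct = sum(int(p) == int(a) for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)

# Example: 12 of 15 problems answered correctly -> 80.0
print(aime_score([1] * 12 + [0] * 3, [1] * 12 + [999] * 3))
```

Under these assumptions a model that solves 12 of the 15 problems on one exam scores 80.0.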
Publication
This benchmark was published in 2024.