ARC
reasoning
Pending Verification
AI2 Reasoning Challenge (ARC) tests reasoning through grade-school science questions.
Published: 2018
Score Range: 0-100
Top Score: 96.4
ARC Leaderboard
Rank | Model | Provider | Score | Parameters | Released | Type |
---|---|---|---|---|---|---|
1 | Claude 3 Opus | Anthropic | 96.4 | | 2024-03-04 | Multimodal |
2 | Claude 3 Sonnet | Anthropic | 93.2 | | 2024-03-04 | Multimodal |
3 | Claude 3 Haiku | Anthropic | 89.2 | | 2024-03-04 | Multimodal |
4 | Mixtral 8×22B | Mistral AI | 70 | 141B (39B active) | 2024-04-17 | Text |
About ARC
Methodology
ARC scores are reported on a scale of 0 to 100, where higher scores indicate better performance. Because ARC questions are multiple choice, a score is typically the percentage of questions answered correctly. For details of the scoring system and methodology, refer to the original paper.
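As a rough illustration of how a 0 to 100 score can arise from multiple-choice answers, the sketch below computes plain accuracy as a percentage. It assumes the score is simply the share of questions answered correctly; this page does not specify the exact scoring rule, prompt format, or question subset behind the leaderboard numbers. The function name `arc_style_score` and the sample answers are hypothetical.

```python
from typing import Sequence


def arc_style_score(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of questions answered correctly (0-100).

    Assumption: plain accuracy over multiple-choice labels. The scoring
    used for the leaderboard above may differ in its details.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    # Compare answer labels case-insensitively, ignoring surrounding whitespace.
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return 100.0 * correct / len(answers)


# Hypothetical data: four questions, three answered correctly.
predicted = ["A", "c", "B", "D"]
gold = ["A", "C", "D", "D"]
print(f"{arc_style_score(predicted, gold):.1f}")  # prints 75.0
```

Under this reading, a leaderboard score of 96.4 would correspond to roughly 96 correct answers per 100 questions.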
Publication
This benchmark was published in 2018.
Technical Paper