ARC

Category: Reasoning
Status: Pending Verification

The AI2 Reasoning Challenge (ARC) tests reasoning with multiple-choice, grade-school science questions.

Published: 2018
Score Range: 0-100
Top Score: 96.4

ARC Leaderboard

Rank  Model            Provider    Score  Parameters         Released    Type
1     Claude 3 Opus    Anthropic   96.4   -                  2024-03-04  Multimodal
2     Claude 3 Sonnet  Anthropic   93.2   -                  2024-03-04  Multimodal
3     Claude 3 Haiku   Anthropic   89.2   -                  2024-03-04  Multimodal
4     Mixtral 8×22B    Mistral AI  70     141B (39B active)  2024-04-17  Text

About ARC

Methodology

ARC reports model performance as a single score on a scale of 0 to 100, where higher scores indicate better performance. For details of the scoring system and methodology, see the original paper.
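
This page does not spell out the scoring formula. As a minimal sketch, assuming the score is plain accuracy (the percentage of multiple-choice questions answered correctly), a 0 to 100 score could be computed as shown here. The function name `accuracy_score` and the toy answers are hypothetical, and the protocol in the original paper may differ in details such as how ties are handled.

```python
def accuracy_score(predictions, answers):
    """Return the percentage of questions answered correctly (0-100).

    Assumption: one predicted answer label per question, scored as
    simple exact-match accuracy. This is an illustrative sketch, not
    the benchmark's official scoring code.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical toy data: four questions, three answered correctly.
answers = ["A", "C", "B", "D"]
predictions = ["A", "C", "D", "D"]
score = accuracy_score(predictions, answers)
print(round(score, 1))  # 75.0
```

Under this assumption, a listed score of 96.4 would correspond to roughly 96 correct answers per 100 questions.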

Publication

This benchmark was published in 2018.

Related Benchmarks