MMLU-Pro

knowledge | Verified

MMLU-Pro is an enhanced version of the MMLU benchmark with over 12,000 challenging questions across 14 domains: Biology, Business, Chemistry, Computer Science, Economics, Engineering, Health, History, Law, Math, Philosophy, Physics, Psychology, and a catch-all "Others" category. Each question has 10 answer choices (vs. 4 in MMLU), and the questions focus on complex reasoning tasks.
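One consequence of moving from 4 to 10 answer choices is a lower random-guess baseline. The arithmetic can be sketched as follows, assuming uniform guessing over the options; the function name `chance_accuracy` is hypothetical and used only for illustration.

```python
def chance_accuracy(num_choices: int) -> float:
    """Expected score (0-100 scale) from uniform random guessing."""
    if num_choices < 1:
        raise ValueError("num_choices must be a positive integer")
    return 100.0 / num_choices


# MMLU uses 4 options per question; MMLU-Pro uses 10.
mmlu_baseline = chance_accuracy(4)
mmlu_pro_baseline = chance_accuracy(10)
print(mmlu_baseline, mmlu_pro_baseline)  # 25.0 10.0
```

Under this assumption, a score near 10 on MMLU-Pro is what guessing alone would produce, compared with roughly 25 on MMLU.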

Published: 2025
Score Range: 0-100
Top Score: 84

MMLU-Pro Leaderboard

Rank  Model                  Provider     Score  Parameters                          Released    Type
1     DeepSeek-R1            DeepSeek     84     671B (37B activated)                2025-01-20  Text
2     Kimi K2                Moonshot AI  81.1   1T total, 32B activated             2025-07-11  Text
3     Grok 3                 xAI          79.9   Unknown (multi-trillion estimated)  2025-02-19  Multimodal
4     Gemini 2.0 Pro         Google       79.1   -                                   2025-02-05  Multimodal
5     Grok 3 Mini            xAI          78.9   Unknown                             2025-02-19  Multimodal
6     Gemini 2.0 Flash       Google       77.6   -                                   2025-02-25  Multimodal
7     DeepSeek-V3            DeepSeek     75.9   671B total, 37B activated           2024-12-26  Text
8     Grok-2                 xAI          75.5   Unknown                             2024-08-13  Multimodal
9     Grok-2 mini            xAI          72     Unknown                             2024-08-13  Multimodal
10    Gemini 2.0 Flash-Lite  Google       71.6   -                                   2025-02-25  Multimodal

About MMLU-Pro

Methodology

MMLU-Pro evaluates model performance using a standardized scoring methodology. Scores are reported on a scale of 0 to 100, where higher scores indicate better performance. For detailed information about the scoring system and methodology, please refer to the original paper.
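This page does not spell out the metric itself. As a minimal sketch, assuming the score is plain accuracy (the percentage of questions answered correctly) over option letters A through J, a 0 to 100 score could be computed as below. The function name `accuracy_score` and the sample answers are hypothetical and are not taken from the benchmark.

```python
from typing import Sequence


def accuracy_score(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage (0-100) of predictions that match the answer key.

    Assumption: each item is a single option letter such as "A" .. "J";
    comparison ignores case and surrounding whitespace.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return 100.0 * correct / len(answers)


# Illustrative data only: 3 of the 4 predictions match the key.
preds = ["A", "j", "C", "D"]
gold = ["A", "J", "B", "D"]
print(accuracy_score(preds, gold))  # 75.0
```

The original paper remains the authoritative description of how model outputs are parsed and scored.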

Publication

This benchmark was published in 2025.

Related Benchmarks