MMLU-Pro

knowledge | Verified

MMLU-Pro is an enhanced version of the MMLU benchmark with over 12,000 challenging questions across 14 domains: Biology, Business, Chemistry, Computer Science, Economics, Engineering, Health, History, Law, Math, Philosophy, Physics, Psychology, and Other. Each question has up to 10 answer choices (vs. 4 in MMLU), and the benchmark focuses on complex reasoning tasks.
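
As a rough illustration of what the larger answer set means for guessing, the sketch below computes the expected score from uniform random guessing on the 0-100 scale. It assumes every question has exactly the stated number of choices, and the function name `chance_score` is hypothetical, not part of any MMLU-Pro tooling.

```python
def chance_score(num_choices: int) -> float:
    """Expected 0-100 score from uniform random guessing.

    Assumes every question offers exactly `num_choices` options.
    """
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return 100.0 / num_choices


print(chance_score(4))   # MMLU-style, 4 choices: 25.0
print(chance_score(10))  # MMLU-Pro-style, 10 choices: 10.0
```

Under these assumptions, a model that guesses at random would land near 10 rather than 25, so low scores on this leaderboard are less likely to be inflated by luck.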

Published: 2024
Score Range: 0-100
Top Score: 90.1

MMLU-Pro Leaderboard

Rank | Model | Provider | Score | Parameters | Released | Type
1 | Gemini 3 Pro | Google | 90.1 | Proprietary | 2025-11-18 | Multimodal
2 | GLM-4.7 | Z.ai | 87.5 | Unreleased | 2025-12-22 | Text
3 | Kimi K2 | Moonshot AI | 84.6 | 1T total, 32B activated | 2025-07-11 | Text
4 | DeepSeek-R1 | DeepSeek | 84.0 | 671B total, 37B activated | 2025-01-20 | Text
5 | Grok 3 | xAI | 79.9 | Unknown (multi-trillion estimated) | 2025-02-19 | Multimodal
6 | Gemini 2.0 Pro | Google | 79.1 | Not listed | 2025-02-05 | Multimodal
7 | Grok 3 Mini | xAI | 78.9 | Unknown | 2025-02-19 | Multimodal
8 | Nemotron 3 Nano | NVIDIA | 78.3 | 31.6B total, ~3.2B active | 2025-12-15 | Text
9 | Gemini 2.0 Flash | Google | 77.6 | Not listed | 2025-02-25 | Multimodal
10 | DeepSeek-V3 | DeepSeek | 75.9 | 671B total, 37B activated | 2024-12-26 | Text

About MMLU-Pro

Methodology

MMLU-Pro scores are reported on a scale of 0 to 100, where higher scores indicate better performance. For details of the scoring system and methodology, refer to the original paper.
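
This page does not spell out how the 0-100 score is computed. A minimal sketch, assuming the score is plain accuracy (the percentage of multiple-choice questions answered correctly), might look like the following. The function name `accuracy_score` and the letter labels are illustrative assumptions, not the benchmark's official evaluation code.

```python
from typing import Sequence


def accuracy_score(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of predictions that match the gold answers.

    Assumption: answers are option letters (e.g. "A" through "J") and
    comparison ignores case and surrounding whitespace.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return 100.0 * correct / len(answers)


# Hypothetical model outputs and gold labels for four questions.
preds = ["A", "j", "C", "D"]
gold = ["A", "J", "B", "D"]
print(accuracy_score(preds, gold))  # 75.0
```

Real evaluation harnesses also have to extract the chosen letter from free-form model output, which this sketch leaves out.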

Publication

This benchmark was published in 2024. See the technical paper for details.

Related Benchmarks