Global-MMLU
A multilingual evaluation set spanning 42 languages that combines machine translations of MMLU questions with professional translations and crowd-sourced post-edits. It includes cultural sensitivity annotations that classify each question as Culturally Sensitive (CS) or Culturally Agnostic (CA).
Global-MMLU Leaderboard
Rank | Model | Provider | Score | Parameters | Released | Type |
---|---|---|---|---|---|---|
1 | Gemini 2.5 Pro | | 88.6 | | 2025-05-06 | Multimodal |
2 | Gemma 3 | | 75.4 | 1B, 4B, 12B, 27B | 2025-03-12 | Multimodal |
About Global-MMLU
Methodology
Global-MMLU reports model scores on a scale of 0 to 100, where higher scores indicate better performance. For details on the methodology, refer to the original paper.
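This page does not say how the 0 to 100 score is computed. As a rough illustration only, the sketch below assumes the score is multiple-choice accuracy expressed as a percentage, and reports it separately for the CS and CA subsets as well as overall. The `Result` class, its field names, and the sample data are hypothetical and are not taken from the benchmark's actual tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Result:
    """One graded question. All field names here are hypothetical."""
    predicted: str  # option chosen by the model, e.g. "A"
    answer: str     # gold option
    label: str      # "CS" (culturally sensitive) or "CA" (culturally agnostic)


def accuracy_by_label(results):
    """Return percent correct (0 to 100) per label, plus an 'overall' entry."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in results:
        # Count each question once under its own label and once under "overall".
        for key in (r.label, "overall"):
            total[key] += 1
            correct[key] += int(r.predicted == r.answer)
    return {key: 100.0 * correct[key] / total[key] for key in total}


# Made-up example data, assumed for illustration only.
results = [
    Result("A", "A", "CA"),
    Result("B", "C", "CA"),
    Result("D", "D", "CS"),
    Result("A", "A", "CS"),
]
scores = accuracy_by_label(results)
print(scores)
```

Splitting the score by label makes it possible to see whether a model's performance differs between culturally sensitive and culturally agnostic questions, rather than reading a single aggregate number.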
Publication
This benchmark was published in 2025.