HiddenMath
math, Verified
Google's internal holdout set of competition math problems
Published: 2025
Score Range: 0-100
Top Score: 65.2
HiddenMath Leaderboard
Rank | Model | Provider | Score | Parameters | Released | Type |
---|---|---|---|---|---|---|
1 | Gemini 2.0 Pro | Google | 65.2 | | 2025-02-05 | Multimodal |
2 | Gemini 2.0 Flash | Google | 63.5 | | 2025-02-25 | Multimodal |
3 | Gemini 2.0 Flash-Lite | Google | 55.3 | | 2025-02-25 | Multimodal |
About HiddenMath
Methodology
HiddenMath scores models on Google's internal holdout set of competition math problems. Scores are reported on a scale of 0 to 100, and higher scores indicate better performance. For details of the scoring system and methodology, refer to the original paper.
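As a rough illustration of how scores on this scale are compared, the sketch below ranks the three leaderboard entries best-first and checks that each score lies in the 0 to 100 range. The `Entry` class and `rank` function are hypothetical names introduced for this example; they are not part of the benchmark or of any published tooling, and the sketch says nothing about how the scores themselves are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical helper type for this example)."""
    model: str
    score: float  # reported on the 0-100 scale


# Scores copied from the leaderboard table; order is deliberately shuffled.
ENTRIES = [
    Entry("Gemini 2.0 Flash-Lite", 55.3),
    Entry("Gemini 2.0 Pro", 65.2),
    Entry("Gemini 2.0 Flash", 63.5),
]


def rank(entries):
    """Return entries sorted best-first, rejecting out-of-range scores."""
    for entry in entries:
        if not 0.0 <= entry.score <= 100.0:
            raise ValueError(f"score out of range for {entry.model}: {entry.score}")
    # Higher scores indicate better performance, so sort in descending order.
    return sorted(entries, key=lambda entry: entry.score, reverse=True)


ranked = rank(ENTRIES)
for position, entry in enumerate(ranked, start=1):
    print(f"{position}. {entry.model}: {entry.score:.1f}")
```

The first entry of `ranked` corresponds to the Top Score field shown at the top of the page.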
Publication
This benchmark was published in 2025.