HiddenMath

math | Verified

Google's internal holdout set of competition math problems

Published: 2025
Score Range: 0-100
Top Score: 65.2

HiddenMath Leaderboard

Rank  Model                  Provider  Score  Parameters  Released    Type
1     Gemini 2.0 Pro         Google    65.2   -           2025-02-05  Multimodal
2     Gemini 2.0 Flash       Google    63.5   -           2025-02-25  Multimodal
3     Gemini 2.0 Flash-Lite  Google    55.3   -           2025-02-25  Multimodal

About HiddenMath

Methodology

HiddenMath reports one score per model on a scale of 0 to 100, where higher scores indicate better performance. For details of the scoring system and methodology, refer to the original paper.
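This page does not say how the 0 to 100 score is computed. As an illustration only, the sketch below assumes a convention common to math benchmarks: the percentage of problems whose final answer exactly matches the reference. The function name `percent_score`, the exact-match rule, and the sample answers are all hypothetical and are not taken from HiddenMath or its paper.

```python
def percent_score(predictions, references):
    """Return the share of exact-match answers, scaled to 0-100.

    Assumption: scoring is plain exact-match accuracy. The real
    HiddenMath grading procedure may differ.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    # Compare answers after trimming surrounding whitespace.
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Hypothetical answers for illustration only (not HiddenMath data).
example_predictions = ["42", "7/3", "12"]
example_references = ["42", "7/3", "13"]
example_score = percent_score(example_predictions, example_references)
print(f"{example_score:.1f}")
```

Under this assumption, a reported score of 65.2 would mean that roughly 65 of every 100 problems were answered correctly.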

Publication

This benchmark was published in 2025.

Technical Paper