Scale-MultiChallenge

diverse · Pending Verification

A multi-domain challenge set created by Scale AI to test models across diverse tasks.

Published: 2024
Score Range: 0-100
Top Score: 56.51

Scale-MultiChallenge Leaderboard

Rank  Model    Provider  Score  Parameters  Released    Type
1     o3       OpenAI    56.51  -           2025-04-16  Multimodal
2     o4-mini  OpenAI    42.99  -           2025-04-16  Multimodal

About Scale-MultiChallenge

Methodology

Scale-MultiChallenge scores every model with the same standardized methodology. Scores are reported on a scale of 0 to 100, and higher scores indicate better performance. For details of the scoring system and methodology, refer to the original paper.
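
As a rough illustration of how the leaderboard ordering follows from the 0 to 100 scale, the sketch below range-checks the two scores listed on this page and sorts them highest first. It is not Scale AI's scoring code: the `Entry` class and `rank` function are hypothetical names introduced only for this example, and only the model names, provider, and scores come from the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    model: str
    provider: str
    score: float  # reported on the 0-100 scale, higher is better


def rank(entries):
    """Return (rank, entry) pairs ordered from highest to lowest score."""
    for e in entries:
        # Reject anything outside the documented 0-100 score range.
        if not 0.0 <= e.score <= 100.0:
            raise ValueError(f"score out of range for {e.model}: {e.score}")
    ordered = sorted(entries, key=lambda e: e.score, reverse=True)
    return list(enumerate(ordered, start=1))


# Scores as listed on this page, deliberately given out of order.
entries = [
    Entry("o4-mini", "OpenAI", 42.99),
    Entry("o3", "OpenAI", 56.51),
]

ranked = rank(entries)
for position, e in ranked:
    print(f"{position}. {e.model} ({e.provider}): {e.score:.2f}")
```

Sorting with `reverse=True` encodes the "higher is better" convention. A benchmark where lower scores are better would need the opposite order.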

Publication

This benchmark was published in 2024.

Related Benchmarks