Codeforces
coding
Evaluates models on competitive programming problems from the Codeforces platform.
Published: 2023
Scale: 0-3000
Top Score: 2,719
Codeforces Leaderboard
Rank | Model | Provider | Score | Parameters | Released | Type |
---|---|---|---|---|---|---|
1 | o4-mini | OpenAI | 2,719 | | 2025-04-16 | Multimodal |
2 | o3 | OpenAI | 2,706 | | 2025-04-16 | Multimodal |
3 | Qwen-3 | Alibaba | 2,056 | 235B (22B active) | 2025-04-29 | Text |
4 | DeepSeek-R1 | DeepSeek | 2,029 | 671B (37B active) | 2025-01-20 | Text |
5 | o1 | OpenAI | 1,673 | | 2024-09-12 | Multimodal |
6 | o1-mini | OpenAI | 1,650 | | 2024-09-12 | Text |
7 | o1-preview | OpenAI | 1,258 | | 2024-09-12 | Text |
8 | GPT-4o | OpenAI | 900 | | 2024-05-13 | Multimodal |
9 | DeepSeek-V3 | DeepSeek | 51.6 | 671B (37B active) | 2024-12-26 | Text |
About Codeforces
Methodology
Scores are reported on a scale of 0 to 3000, and higher scores indicate better performance. For details of how scores are computed, refer to the original paper.
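As a rough aid for reading gaps between leaderboard scores, the sketch below applies the standard Elo expected-score formula to two entries from the table. This rests on an assumption the page does not confirm: that the reported scores behave like Elo-style ratings, as contest ratings on the Codeforces platform do. The function name `expected_score` is illustrative only.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B.

    ASSUMPTION: the leaderboard scores are Elo-style ratings.
    The benchmark page does not state this.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Scores copied from the leaderboard table.
o4_mini = 2719
qwen_3 = 2056

p = expected_score(o4_mini, qwen_3)
print(f"Expected score of o4-mini against Qwen-3: {p:.3f}")
```

Under that assumption, equal ratings give an expected score of 0.5, and each additional 400 points multiplies the odds in favour of the stronger side by ten.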
Publication
This benchmark was published in 2023.