Cybersecurity CTF

security
Pending Verification

Evaluates models on their ability to solve cybersecurity challenges across various domains including cryptography, web exploitation, reverse engineering, binary exploitation, and forensics.

Published: 2024
Score Range: 0-100
Top Score: 43

Cybersecurity CTF Leaderboard

Rank  Model       Provider  Score  Parameters  Released    Type
1     o1-preview  OpenAI    43     -           2024-09-12  Text
2     o1-mini     OpenAI    28.7   -           2024-09-12  Text

About Cybersecurity CTF

Methodology

Cybersecurity CTF reports scores on a scale of 0 to 100, where higher scores indicate better performance. The scoring system and methodology are described in the original paper.

Publication

This benchmark was published in 2024 and is described in a technical paper.