Cybersecurity CTF

security | Pending Verification

Evaluates models on their ability to solve cybersecurity challenges across various domains including cryptography, web exploitation, reverse engineering, binary exploitation, and forensics.

Published: 2024
Score Range: 0-100
Top Score: 43

Cybersecurity CTF Leaderboard

Rank  Model       Provider  Score  Parameters  Released    Type
1     o1-preview  OpenAI    43     -           2024-09-12  Text
2     o1-mini     OpenAI    28.7   -           2024-09-12  Text

About Cybersecurity CTF

Methodology

Cybersecurity CTF scores models with a standardized methodology on a scale of 0 to 100; higher scores indicate better performance. For details of the scoring system and methodology, refer to the original paper.
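As an illustration only, a 0 to 100 score of this kind could be computed as the percentage of challenges solved. The sketch below assumes equal weighting across challenges; the benchmark's actual scoring rule is not stated on this page and may differ. The names `ChallengeResult` and `ctf_score`, and the sample challenge names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChallengeResult:
    """Outcome of one CTF challenge attempt (hypothetical record type)."""
    name: str
    category: str  # e.g. "cryptography", "forensics"
    solved: bool


def ctf_score(results: list[ChallengeResult]) -> float:
    """Return the percentage of solved challenges on a 0-100 scale.

    Assumption: every challenge carries equal weight. The benchmark's
    real scoring rule may weight challenges or categories differently.
    """
    if not results:
        raise ValueError("no challenge results to score")
    solved = sum(1 for r in results if r.solved)
    return 100.0 * solved / len(results)


# Made-up sample data, not taken from the benchmark.
sample = [
    ChallengeResult("weak-rsa", "cryptography", True),
    ChallengeResult("sql-login", "web exploitation", True),
    ChallengeResult("crackme-01", "reverse engineering", False),
    ChallengeResult("stack-smash", "binary exploitation", False),
]

print(ctf_score(sample))
```

Under this assumption, a model that solves two of four challenges scores 50. A per-category breakdown could be obtained by filtering `results` on `category` before calling `ctf_score`.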

Publication

This benchmark was published in 2024.