Cybersecurity CTF
security
Pending Verification
Evaluates models on their ability to solve cybersecurity challenges across various domains including cryptography, web exploitation, reverse engineering, binary exploitation, and forensics.
Published: 2024
Score Range: 0-100
Top Score: 43
Cybersecurity CTF Leaderboard
Rank | Model | Provider | Score | Parameters | Released | Type |
---|---|---|---|---|---|---|
1 | o1-preview | OpenAI | 43 | | 2024-09-12 | Text |
2 | o1-mini | OpenAI | 28.7 | | 2024-09-12 | Text |
About Cybersecurity CTF
Methodology
Cybersecurity CTF scores each model on a scale of 0 to 100, where higher scores indicate better performance. For details on the scoring system and methodology, refer to the original paper.
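As an illustration of the 0 to 100 scale, here is a minimal Python sketch that validates the two leaderboard scores and orders them best-first. It assumes rank follows score in descending order; this page does not state how ties are broken or how the score itself is computed. The names `Entry` and `rank_entries` are hypothetical and are not part of the benchmark.

```python
from dataclasses import dataclass

# Score range as stated on this page.
SCORE_MIN, SCORE_MAX = 0.0, 100.0


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical structure for illustration)."""
    model: str
    provider: str
    score: float


def rank_entries(entries):
    """Check every score against the 0-100 range, then sort best-first."""
    for e in entries:
        if not SCORE_MIN <= e.score <= SCORE_MAX:
            raise ValueError(f"{e.model}: score {e.score} is outside 0-100")
    # Assumption: rank is determined by score alone, highest first.
    return sorted(entries, key=lambda e: e.score, reverse=True)


# Scores taken from the leaderboard table on this page.
leaderboard = rank_entries([
    Entry("o1-mini", "OpenAI", 28.7),
    Entry("o1-preview", "OpenAI", 43.0),
])

for rank, e in enumerate(leaderboard, start=1):
    print(rank, e.model, e.provider, e.score)
```

A score outside the stated range raises `ValueError` rather than being clamped, so a malformed entry is surfaced instead of silently ranked.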
Publication
This benchmark was published in 2024.
Technical Paper