Cybersecurity CTF

Evaluates models on their ability to solve capture-the-flag (CTF) cybersecurity challenges across domains including cryptography, web exploitation, reverse engineering, binary exploitation, and forensics.

o1-preview

Large reasoning model with strong capabilities for solving hard problems through extended thinking. Designed to spend more time reasoning before responding, with enhanced performance in science, coding, and math.

o1-mini

Faster, cost-efficient reasoning model optimized for coding and agentic applications. 80% cheaper than o1-preview, with strong capabilities for complex problem-solving in tasks where the context is provided within the prompt.

Rank	Model	Provider	Score	Parameters	Released	Type
1	o1-preview	OpenAI	43		2024-09-12	Text
2	o1-mini	OpenAI	28.7		2024-09-12	Text
