OpenAI logo

o3

OpenAIProprietaryVerified

OpenAI's most powerful reasoning model that pushes the frontier across coding, math, science, visual perception, and more. Sets new state-of-the-art on benchmarks including Codeforces, SWE-bench, and MMMU. Features full tool access and can agentically use and combine every tool within ChatGPT.

2025-04-16
Decoder-only Transformer with reinforcement learning
Proprietary

Specifications

Architecture
Decoder-only Transformer with reinforcement learning
License
Proprietary
Type
multimodal
Modalities
textvision

Benchmark Scores

AIME91.6

American Invitational Mathematics Examination (AIME) problems test advanced mathematical problem-sol...

GPQA83.3

Graduate-level Problems in Quantitative Analysis (GPQA) evaluates advanced reasoning on graduate-lev...

Software Engineering Benchmark (SWE-bench) evaluates models on real-world software engineering tasks...

MMMU82.9

A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI with 11.5...

Evaluates mathematical reasoning in visual contexts, combining vision and mathematical problem-solvi...

Tests reasoning on challenging problems from arXiv papers across multiple scientific domains....

Advanced competitive programming benchmark for evaluating large language models on algorithmic probl...

A benchmark of over 1,400 freelance software engineering tasks from Upwork, valued at $1 million USD...

Tests models on their ability to write code in multiple programming languages....

A multi-domain challenge set created by Scale AI to test models across diverse tasks....

A benchmark for measuring browsing agents' ability to navigate the web and find hard-to-find, entang...

Tool Augmented Understanding Benchmark (TAU-bench) evaluates models on their ability to use tools....

Testing long-term coherence in agents by simulating a vending machine business. Agents manage orderi...

Advanced Specifications

Model Family
o-series
API Access
Available
Chat Interface
Available
Multilingual Support
Yes

Capabilities & Limitations

Capabilities
codemathreasoningvisual perceptiontool useagentic
Known Limitations
May make errors in complex reasoning tasks
Function Calling Support
Yes
Tool Use Support
Yes

Related Models