OpenAI logo

GPT-4o

OpenAIProprietaryPending Verification

GPT-4o ('o' for 'omni') is a multimodal model that accepts any combination of text, audio, image, and video inputs and generates text, audio, and image outputs. It matches GPT-4 Turbo performance on English text and code, with significant improvements for non-English languages. It responds to audio inputs in as little as 232 milliseconds (avg. 320ms), similar to human conversation response time. GPT-4o is 50% cheaper in the API than previous models and features superior vision and audio understanding capabilities.

2024-05-13
Decoder-only Transformer (with vision encoder for images and audio processing)
Proprietary

Specifications

Architecture
Decoder-only Transformer (with vision encoder for images and audio processing)
License
Proprietary
Context Window
128,000 tokens
Max Output
16,384 tokens
Training Data Cutoff
Sep 30, 2023
Type
multimodal
Modalities
textvisionaudiovideo

Benchmark Scores

Codeforces900

Evaluates models on competitive programming problems from the Codeforces platform....

HumanEval90.2

Evaluates code generation capabilities by asking models to complete Python functions based on docstr...

Cybersecurity CTF20

Evaluates models on their ability to solve cybersecurity challenges across various domains including...

MMLU88.7

Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...

GPQA53.6

Graduate-level Problems in Quantitative Analysis (GPQA) evaluates advanced reasoning on graduate-lev...

MATH76.6

A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning....

MATH 50060.3

A sample of 500 diverse problems from the MATH benchmark, spanning topics like probability, algebra,...

MGSM90.5

Multilingual Grade School Math (MGSM) extends GSM8K to 10 languages....

DROP83.4

Discrete Reasoning Over Paragraphs (DROP) requires models to resolve references in a passage and per...

See all benchmarks

Advanced Specifications

Model Family
omni
API Access
Not Available
Chat Interface
Not Available

Capabilities & Limitations

Related Models