
GPT-OSS-20B

OpenAI, Open Weights, Verified

GPT-OSS-20B is a medium-sized open-weight language model that delivers similar results to OpenAI o3‑mini on common benchmarks and can run on edge devices with just 16 GB of memory. This makes it ideal for on-device use cases, local inference, or rapid iteration without costly infrastructure. Despite its smaller size, it demonstrates strong performance on reasoning tasks, tool use, and competition mathematics.

Release date: 2025-08-05

Specifications

Parameters: 21B total (3.6B active per token)
Architecture: Mixture of Experts Transformer
License: Apache-2.0
Context Window: 128,000 tokens
Type: text
Modalities: text

Benchmark Scores

MMLU: 85.3

Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...

GPQA: 71.5

Graduate-Level Google-Proof Q&A (GPQA) evaluates advanced reasoning on graduate-lev...

A challenging benchmark of novel problems designed to test the limits of AI capabilities....

American Invitational Mathematics Examination (AIME) 2024 problems....

American Invitational Mathematics Examination (AIME) 2025 problems....

Advanced competitive programming benchmark for evaluating large language models on algorithmic probl...

Tool-Agent-User benchmark (TAU-bench) evaluates models on their ability to use tools....

Advanced Specifications

Model Family: gpt-oss
API Access: Not Available
Chat Interface: Not Available
Variants: MXFP4 quantized
Hardware Support: NVIDIA, AMD, Cerebras, Groq, ONNX Runtime
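
The 16 GB memory figure can be sanity-checked with a back-of-envelope calculation from the parameter count and the MXFP4 variant listed above. The sketch below is illustrative only: it assumes every one of the 21B parameters is stored in MXFP4 (4-bit elements plus one shared 8-bit scale per 32-element block, about 4.25 bits per parameter), which is a simplification, and it ignores KV cache, activations, and runtime overhead. The function and constant names are hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 21e9    # total parameters, from the spec sheet
ACTIVE_PARAMS = 3.6e9  # parameters active per token, from the spec sheet

# Assumption: MXFP4 stores 4-bit elements plus one 8-bit shared scale
# per block of 32 elements, i.e. 4 + 8/32 = 4.25 bits per parameter.
MXFP4_BITS = 4 + 8 / 32

# Assumption: all weights are quantized; real checkpoints may keep
# some tensors at higher precision, so treat this as a lower bound.
mxfp4_gb = weight_memory_gb(TOTAL_PARAMS, MXFP4_BITS)

# For comparison: the same parameter count at 16 bits per parameter.
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)

# Share of the parameters a mixture-of-experts forward pass touches per token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"MXFP4 weights:   {mxfp4_gb:.1f} GB")
print(f"16-bit weights:  {bf16_gb:.1f} GB")
print(f"Active fraction: {active_fraction:.1%}")
```

Under these assumptions the quantized weights come to roughly 11 GB, which leaves some headroom within 16 GB, whereas 16-bit weights alone would need about 42 GB.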

Capabilities & Limitations

Capabilities: chain-of-thought reasoning, tool use, web search, Python code execution, function calling, structured outputs, variable reasoning effort, agentic workflows, competition mathematics, coding, scientific reasoning, edge device deployment, low-memory inference
Known Limitations: chain-of-thought (CoT) may contain hallucinated or harmful content; CoT should not be shown directly to users; may include language that doesn't reflect OpenAI's safety policies; requires safety monitoring for CoT content
Notable Use Cases: edge device deployment, local inference, on-device applications, rapid prototyping, resource-constrained environments, competition mathematics, code generation, agentic workflows, custom fine-tuning, research and development
Function Calling Support: Yes
Tool Use Support: Yes

Related Models