GPT-OSS-120B

Author: OpenAI

OpenAI, Open Weights, Verified

GPT-OSS-120B is a state-of-the-art open-weight language model that delivers strong real-world performance at low cost. This mixture-of-experts model has 117 billion total parameters (5.1 billion active per token), achieves near-parity with OpenAI o4-mini on core reasoning benchmarks, and runs efficiently on a single 80 GB GPU. It was trained using reinforcement learning and techniques informed by OpenAI's most advanced internal models, including o3 and other frontier systems.

Released: 2025-08-05

Specifications

Parameters: 117B total (5.1B active per token)
Architecture: Mixture of Experts Transformer
License: Apache-2.0
Context Window: 128,000 tokens
Type: text
Modalities: text
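
As a rough illustration of what the parameter counts above imply, the following Python sketch computes the share of the model's weights that is used for each token. The constant and function names are hypothetical and chosen only for this example; the two figures are taken from the Parameters line.

```python
# Figures from the Specifications list above.
TOTAL_PARAMS = 117e9   # 117B total parameters
ACTIVE_PARAMS = 5.1e9  # 5.1B parameters active per token


def active_fraction(total: float, active: float) -> float:
    """Return the fraction of parameters used for a single token."""
    return active / total


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active per token: {frac:.1%}")  # prints "Active per token: 4.4%"
```

In other words, under these figures only about one parameter in 23 takes part in computing any given token, which is the usual reason a mixture-of-experts model can be cheaper to run than a dense model of the same total size.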

Benchmark Scores

MMLU: 90

Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...

GPQA: 80.1

Graduate-Level Google-Proof Q&A (GPQA) evaluates advanced reasoning on graduate-lev...

Humanity's Last Exam: 19

A challenging benchmark of novel problems designed to test the limits of AI capabilities....

AIME 2024: 96.6

American Invitational Mathematics Examination (AIME) 2024 problems....

AIME 2025: 97.9

American Invitational Mathematics Examination (AIME) 2025 problems....

Codeforces: 2,622

Advanced competitive programming benchmark for evaluating large language models on algorithmic probl...

TAU-bench: 67.8

Tool-Agent-User benchmark (TAU-bench) evaluates models on their ability to use tools....

Advanced Specifications

Model Family: gpt-oss
API Access: Not Available
Chat Interface: Not Available
Variants: MXFP4 quantized
Hardware Support: NVIDIA, AMD, Cerebras, Groq, ONNX Runtime
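
To give a feel for why the MXFP4 variant is relevant to the single 80 GB GPU figure in the description, here is a back-of-the-envelope Python sketch of weight storage. It rests on assumptions that are not stated on this page: that MXFP4 stores 4-bit elements plus one shared 8-bit scale per block of 32 values (as in the OCP Microscaling format), and that every one of the 117B parameters is stored this way. The second assumption is a deliberate simplification, since real checkpoints may keep some tensors at higher precision, and the estimate ignores the KV cache and activations. All names are hypothetical.

```python
TOTAL_PARAMS = 117e9   # from the Specifications list
ELEMENT_BITS = 4       # assumed: 4-bit MXFP4 element
SCALE_BITS = 8         # assumed: one shared 8-bit scale per block
BLOCK_SIZE = 32        # assumed: 32 elements share a scale
GPU_MEMORY_GB = 80     # from the model description


def mxfp4_weight_gb(n_params: float) -> float:
    """Estimate weight storage in decimal gigabytes under the assumptions above."""
    bits_per_param = ELEMENT_BITS + SCALE_BITS / BLOCK_SIZE  # 4.25 bits
    return n_params * bits_per_param / 8 / 1e9


weights_gb = mxfp4_weight_gb(TOTAL_PARAMS)
headroom_gb = GPU_MEMORY_GB - weights_gb
print(f"weights ~ {weights_gb:.1f} GB, headroom ~ {headroom_gb:.1f} GB")
# prints "weights ~ 62.2 GB, headroom ~ 17.8 GB"
```

Under these assumptions the weights alone would leave some room on an 80 GB device, but the actual footprint depends on which tensors are quantized and on the context length in use.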

Capabilities & Limitations

Capabilities: chain-of-thought reasoning, tool use, web search, Python code execution, function calling, structured outputs, variable reasoning effort, agentic workflows, competition mathematics, coding, scientific reasoning, low-latency inference
Known Limitations: Chain-of-thought (CoT) may contain hallucinated or harmful content; CoT should not be directly shown to users; may include language that doesn't reflect OpenAI's safety policies; requires safety monitoring for CoT content
Notable Use Cases: Agentic workflows, competition mathematics, scientific research, code generation and debugging, complex reasoning tasks, tool-assisted problem solving, on-premises deployment, custom fine-tuning, research and development
Function Calling Support: Yes
Tool Use Support: Yes

Related Models

GPT-OSS-20B

OpenAI

GPT-OSS-20B is a medium-sized open-weight language model that delivers similar results to OpenAI o3‑mini on common benchmarks and can run on edge devices with just 16 GB of memory. This makes it ideal for on-device use cases, local inference, or rapid iteration without costly infrastructure. Despite its smaller size, it demonstrates strong performance on reasoning tasks, tool use, and competition mathematics.

Type: text

Parameters: 21B total (3.6B active per token)

2025-08-05

Open Weights

o3

OpenAI

o3 is OpenAI's most powerful reasoning model, pushing the frontier across coding, math, science, visual perception, and more. It sets a new state of the art on benchmarks including Codeforces, SWE-bench, and MMMU. It has full tool access and can agentically use and combine every tool within ChatGPT.

o4-mini

OpenAI

o4-mini is a smaller model optimized for fast, cost-efficient reasoning. It achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks. It is the best-performing benchmarked model on AIME 2024 and 2025, with significantly higher usage limits than o3.