DeepSeek logo

DeepSeek-R1

DeepSeekOpen SourceVerified

DeepSeek-R1 is a first-generation reasoning model trained via large-scale reinforcement learning. Built on DeepSeek-V3-Base, it incorporates cold-start data before RL to address challenges like endless repetition and poor readability found in DeepSeek-R1-Zero. Achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks through advanced chain-of-thought reasoning capabilities.

2025-01-20
671B (37B activated)
Mixture of Experts
MIT

Specifications

Parameters
671B (37B activated)
Architecture
Mixture of Experts
License
MIT
Context Window
128,000 tokens
Max Output
32,768 tokens
Type
text
Modalities
text

Benchmark Scores

MMLU90.8

Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...

mmlu-redux92.9
MMLU-Pro84

MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains including...

DROP92.2

Discrete Reasoning Over Paragraphs (DROP) requires models to resolve references in a passage and per...

if-eval83.3
GPQA71.5

Graduate-level Problems in Quantitative Analysis (GPQA) evaluates advanced reasoning on graduate-lev...

SimpleQA30.1

A benchmark of simple but precise questions to test factual knowledge and reasoning....

frames82.5
alpacaeval87.6
arenahard92.3
livecodebench65.9
Codeforces2,029

Evaluates models on competitive programming problems from the Codeforces platform....

SWE-bench49.2

Software Engineering Benchmark (SWE-bench) evaluates models on real-world software engineering tasks...

Aider-Polyglot53.3

Tests models on their ability to write code in multiple programming languages....

AIME-202479.8

American Invitational Mathematics Examination (AIME) 2024 problems....

MATH 50097.3

A sample of 500 diverse problems from the MATH benchmark, spanning topics like probability, algebra,...

cnmo-202478.8
See all benchmarks

Advanced Specifications

Model Family
R1 series
API Access
Available
Chat Interface
Available
Multilingual Support
Yes
Variants
DeepSeek-R1-Zero (RL without SFT)DeepSeek-R1-Distill-Qwen-1.5BDeepSeek-R1-Distill-Qwen-7BDeepSeek-R1-Distill-Qwen-14BDeepSeek-R1-Distill-Qwen-32BDeepSeek-R1-Distill-Llama-8BDeepSeek-R1-Distill-Llama-70B

Capabilities & Limitations

Capabilities
reasoningchain-of-thoughtself-verificationreflectioncodemathsciencelong-form reasoning
Known Limitations
May bypass thinking pattern for certain queriesRequires temperature 0.5-0.7 to prevent endless repetitionsNo system prompt support recommended
Notable Use Cases
complex mathematical reasoningadvanced coding tasksscientific problem solvingstep-by-step reasoningcompetitive programming
Function Calling Support
No
Tool Use Support
No

Related Models