
DeepSeek-R1
DeepSeek-R1 is a first-generation reasoning model trained via large-scale reinforcement learning. Built on DeepSeek-V3-Base, it incorporates cold-start data before RL to address challenges like endless repetition and poor readability found in DeepSeek-R1-Zero. Achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks through advanced chain-of-thought reasoning capabilities.
Specifications
- Parameters
- 671B (37B activated)
- Architecture
- Mixture of Experts
- License
- MIT
- Context Window
- 128,000 tokens
- Max Output
- 32,768 tokens
- Type
- text
- Modalities
- text
Benchmark Scores
Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...
MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains including...
Discrete Reasoning Over Paragraphs (DROP) requires models to resolve references in a passage and per...
Graduate-level Problems in Quantitative Analysis (GPQA) evaluates advanced reasoning on graduate-lev...
A benchmark of simple but precise questions to test factual knowledge and reasoning....
Evaluates models on competitive programming problems from the Codeforces platform....
Software Engineering Benchmark (SWE-bench) evaluates models on real-world software engineering tasks...
Tests models on their ability to write code in multiple programming languages....
American Invitational Mathematics Examination (AIME) 2024 problems....
A sample of 500 diverse problems from the MATH benchmark, spanning topics like probability, algebra,...
Advanced Specifications
- Model Family
- R1 series
- API Access
- Available
- Chat Interface
- Available
- Multilingual Support
- Yes
- Variants
- DeepSeek-R1-Zero (RL without SFT)DeepSeek-R1-Distill-Qwen-1.5BDeepSeek-R1-Distill-Qwen-7BDeepSeek-R1-Distill-Qwen-14BDeepSeek-R1-Distill-Qwen-32BDeepSeek-R1-Distill-Llama-8BDeepSeek-R1-Distill-Llama-70B
Capabilities & Limitations
- Capabilities
- reasoningchain-of-thoughtself-verificationreflectioncodemathsciencelong-form reasoning
- Known Limitations
- May bypass thinking pattern for certain queriesRequires temperature 0.5-0.7 to prevent endless repetitionsNo system prompt support recommended
- Notable Use Cases
- complex mathematical reasoningadvanced coding tasksscientific problem solvingstep-by-step reasoningcompetitive programming
- Function Calling Support
- No
- Tool Use Support
- No