
Qwen3
Third-generation Qwen model featuring hybrid reasoning: a single model that can switch between a step-by-step thinking mode and a faster non-thinking mode. Trained on 36 trillion tokens (double Qwen2.5's 18 trillion), with support for 119 languages and dialects. Released as six dense models (0.6B to 32B parameters) and two MoE models (30B total with 3B active, and 235B total with 22B active).
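The thinking/non-thinking switch can be controlled per request. A minimal sketch below, assuming an OpenAI-compatible chat API: Qwen3 honors "soft switch" tags (`/think`, `/no_think`) appended to a user message, alongside the `enable_thinking` flag in its chat template. The model id and message content here are illustrative placeholders.

```python
# Sketch: toggling Qwen3's hybrid reasoning per turn via soft-switch tags.
# Assumes an OpenAI-compatible endpoint; model id is a placeholder.

def build_messages(prompt: str, thinking: bool) -> list[dict]:
    """Append Qwen3's soft-switch tag so this turn either enters
    thinking (chain-of-thought) mode or skips it."""
    switch = "/think" if thinking else "/no_think"
    return [{"role": "user", "content": f"{prompt} {switch}"}]

payload = {
    "model": "Qwen3-235B-A22B",  # placeholder model id
    "messages": build_messages("Solve 12 * 17.", thinking=True),
}
```

In multi-turn conversations the most recent switch takes precedence, so an agent can enable deliberate reasoning only for the turns that need it.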
Specifications
- Parameters
- 235B (22B active)
- Architecture
- Mixture of Experts
- License
- Apache 2.0
- Context Window
- 32,768 tokens (native; extendable to 131,072 with YaRN)
- Training Data Cutoff
- April 2025
- Type
- text
- Modalities
- text
Benchmark Scores
Benchmarks evaluated (scores were not captured in this extract):
- AIME 2024: American Invitational Mathematics Examination 2024 problems.
- AIME 2025: American Invitational Mathematics Examination 2025 problems.
- A live benchmark of models' ability to solve recent coding problems.
- Competitive programming problems from the Codeforces platform.
- Code writing across multiple programming languages.
- A contamination-limited benchmark with frequently updated questions from recent sources.
- A comprehensive evaluation of LLMs' function-calling capabilities across various call forms.
- Multi-IF: multi-turn and multilingual instruction following across 8 languages.
Advanced Specifications
- Model Family
- Qwen
- API Access
- Available
- Chat Interface
- Available
- Multilingual Support
- Yes
- Hardware Support
- CUDA, TPU
Capabilities & Limitations
- Capabilities
- hybrid reasoning, thinking mode, non-thinking mode, mathematics, coding, logical reasoning, function calling, tool use, creative writing
- Notable Use Cases
- agent-based applications, complex reasoning tasks, multilingual applications, Alibaba's Quark AI assistant
- Function Calling Support
- Yes
- Tool Use Support
- Yes
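For tool use, Qwen3 follows the Hermes-style convention of emitting each function call as a JSON object wrapped in `<tool_call>` tags. A minimal parser sketch below, under that assumption; the sample model output is illustrative, not a real transcript.

```python
import json
import re

# Sketch: extracting function calls from Qwen3 output, assuming the
# Hermes-style format where each call is a JSON object with "name" and
# "arguments" wrapped in <tool_call>...</tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call found in the model output."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(raw))
        except json.JSONDecodeError:
            continue  # skip malformed JSON rather than crash the agent loop
    return calls

# Illustrative model output (not a real transcript):
sample = (
    "Let me check the weather.\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Hangzhou"}}\n'
    "</tool_call>"
)
calls = extract_tool_calls(sample)
```

An agent loop would dispatch each parsed call to the matching function and feed the result back as a tool message for the next turn.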