
Qwen-3
Third-generation Qwen model with hybrid reasoning: the same model can switch between thinking and non-thinking modes. Trained on 36 trillion tokens (double Qwen2.5's corpus), with support for 119 languages and dialects. Available as 6 dense models (0.6B to 32B parameters) and 2 MoE models (30B total/3B active and 235B total/22B active).
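The thinking/non-thinking switch can be controlled per user turn with Qwen-3's documented "/think" and "/no_think" soft switches in the prompt (with Hugging Face transformers, the same toggle is exposed as `enable_thinking` in `apply_chat_template`). A minimal sketch; `append_mode_switch` is a hypothetical helper, not an official API:

```python
# Sketch: toggling Qwen-3's hybrid thinking mode per user turn.
# Qwen-3 honors "/think" and "/no_think" soft switches appended to a
# user message; `append_mode_switch` is a hypothetical helper.

def append_mode_switch(messages, thinking):
    """Return a copy of `messages` with the soft switch appended to the
    last user turn ("/think" requests step-by-step reasoning,
    "/no_think" suppresses it)."""
    tag = "/think" if thinking else "/no_think"
    updated = [dict(m) for m in messages]
    for m in reversed(updated):
        if m["role"] == "user":
            m["content"] = f'{m["content"]} {tag}'
            break
    return updated

messages = [{"role": "user", "content": "How many primes are below 20?"}]
fast = append_mode_switch(messages, thinking=False)
print(fast[-1]["content"])  # "How many primes are below 20? /no_think"
```

In non-thinking mode the model answers directly; in thinking mode it emits its reasoning inside a `<think>...</think>` block before the final answer.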
Specifications
- Parameters
- 235B (22B active)
- Architecture
- Mixture of Experts
- License
- Apache 2.0
- Context Window
- 32,768 tokens natively (131,072 with YaRN extension)
- Training Data Cutoff
- April 2025
- Type
- text
- Modalities
- text
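The 235B/22B figures mean only a fraction of the parameters are used for any given token: a Mixture-of-Experts layer routes each token to a small top-k subset of experts (Qwen-3's largest MoE activates 8 of 128 experts per layer). A toy top-k gating sketch in pure Python, illustrating the routing idea rather than the actual implementation:

```python
import math
import random

def topk_gate(logits, k):
    """Toy MoE router: pick the k experts with the highest router
    scores and softmax-normalize their weights to sum to 1."""
    idx = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in idx]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(idx, exps)]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(128)]  # one token's router scores
routing = topk_gate(logits, k=8)                   # 8 of 128 experts active
print(len(routing), round(sum(w for _, w in routing), 6))
```

Only the selected experts run their feed-forward computation for that token, which is why a 235B-parameter model has roughly the per-token compute cost of a 22B dense model.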
Benchmark Scores
Advanced competitive programming benchmark for evaluating large language models on algorithmic problems...
A comprehensive code editing benchmark based on Exercism coding exercises across 6 programming languages...
A contamination-limited benchmark with frequently-updated questions from recent sources, scoring answers...
The first comprehensive evaluation of LLMs' function calling capabilities, testing various forms including...
Multi-IF evaluates LLMs on multi-turn and multilingual instruction following across 8 languages, with...
Advanced Specifications
- Model Family
- Qwen
- API Access
- Available
- Chat Interface
- Available
- Multilingual Support
- Yes
- Hardware Support
- CUDA, TPU
Capabilities & Limitations
- Capabilities
- hybrid reasoning, thinking mode, non-thinking mode, mathematics, coding, logical reasoning, function-calling, tool use, creative writing
- Notable Use Cases
- agent-based applications, complex reasoning tasks, multilingual applications, Alibaba's Quark AI assistant
- Function Calling Support
- Yes
- Tool Use Support
- Yes
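Function calling works by describing tools to the model as JSON schemas; the model replies with a structured call that the client executes locally, feeding the result back as a `tool` message. A minimal dispatch sketch using the common OpenAI-style tool format that Qwen's chat endpoints accept; the model's tool-call response is hard-coded here for illustration, and `get_exchange_rate` is a made-up tool:

```python
import json

# Tool described to the model in the common JSON-schema style.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        "description": "Return the exchange rate between two currencies.",
        "parameters": {
            "type": "object",
            "properties": {
                "base": {"type": "string"},
                "quote": {"type": "string"},
            },
            "required": ["base", "quote"],
        },
    },
}]

# Local implementation, keyed by tool name (rates are made up).
def get_exchange_rate(base, quote):
    fake_rates = {("USD", "EUR"): 0.92}
    return {"rate": fake_rates.get((base, quote))}

REGISTRY = {"get_exchange_rate": get_exchange_rate}

def dispatch(tool_call):
    """Execute one model-emitted tool call and return the message to
    send back to the model as the 'tool' role."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    result = REGISTRY[name](**args)
    return {"role": "tool", "name": name, "content": json.dumps(result)}

# Hard-coded stand-in for the assistant's tool-call response.
model_tool_call = {
    "function": {
        "name": "get_exchange_rate",
        "arguments": '{"base": "USD", "quote": "EUR"}',
    }
}
print(dispatch(model_tool_call))
```

In a real agent loop, the `tool` message is appended to the conversation and the model is called again to produce the final answer from the tool result.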