
Qwen-3
Third-generation Qwen model featuring hybrid reasoning capabilities that can switch between thinking and non-thinking modes. Trained on 36 trillion tokens (double that of Qwen2.5), with support for 119 languages and dialects. Available in 6 dense models (0.6B to 32B parameters) and 2 MoE models (30B/3B active and 235B/22B active).
2025-04-29
235B (22B active)
Mixture of Experts
Apache 2.0
Specifications
- Parameters
- 235B (22B active)
- Architecture
- Mixture of Experts
- License
- Apache 2.0
- Context Window
- 32,768 tokens
- Training Data Cutoff
- April 2025
- Type
- text
- Modalities
- text
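The hybrid reasoning described above can be toggled per request. A minimal sketch, assuming an OpenAI-compatible serving endpoint and the `/no_think` soft switch that Qwen3 recognizes in user messages (the model name string is illustrative and depends on your deployment):

```python
# Sketch: toggling Qwen3's thinking mode via the soft-switch suffix.
# The model name is a placeholder; adjust it for your deployment.

def build_request(prompt: str, thinking: bool) -> dict:
    """Build a chat-completions payload; appending /no_think asks the
    model to skip its <think>...</think> reasoning block."""
    suffix = "" if thinking else " /no_think"
    return {
        "model": "Qwen3-235B-A22B",
        "messages": [{"role": "user", "content": prompt + suffix}],
    }

req = build_request("What is 17 * 24?", thinking=False)
print(req["messages"][0]["content"])  # ends with " /no_think"
```

When served through the `transformers` chat template, the same toggle is exposed as the `enable_thinking` argument of `apply_chat_template` instead of a prompt suffix.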
Benchmark Scores
- CodeForces: 2,056. Advanced competitive programming benchmark for evaluating large language models on algorithmic problems.
- LiveBench: 77.1. A contamination-limited benchmark with frequently updated questions from recent sources, scored objectively.
- BFCL. The first comprehensive evaluation of LLMs' function-calling capabilities, testing various forms of function calls.
- Multi-IF: 71.9. Evaluates LLMs on multi-turn and multilingual instruction following across 8 languages.
Advanced Specifications
- Model Family
- Qwen
- API Access
- Available
- Chat Interface
- Available
- Multilingual Support
- Yes
- Hardware Support
- CUDA, TPU
Capabilities & Limitations
- Capabilities
- hybrid reasoning, thinking mode, non-thinking mode, mathematics, coding, logical reasoning, function-calling, tool use, creative writing
- Notable Use Cases
- agent-based applications, complex reasoning tasks, multilingual applications, Alibaba's Quark AI assistant
- Function Calling Support
- Yes
- Tool Use Support
- Yes
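Function-calling support means the model can emit structured tool calls rather than free text. A minimal sketch of a tool declaration in the OpenAI-style `tools` schema that Qwen3 deployments commonly accept; the weather function, model name, and prompt are illustrative assumptions, not part of the card:

```python
# Sketch: declaring a tool for Qwen3's function-calling interface.
# Schema follows the common OpenAI-style "tools" format; get_weather
# is a hypothetical example function.

weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

payload = {
    "model": "Qwen3-235B-A22B",
    "messages": [{"role": "user", "content": "Weather in Hangzhou?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
print(payload["tools"][0]["function"]["name"])  # get_weather
```

With `tool_choice` set to `"auto"`, the model returns either a normal assistant message or a structured tool call naming `get_weather` with a `city` argument, which the caller then executes and feeds back as a tool-role message.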