
Qwen3 235B (MoE)

Alibaba · Open Weights · Pending Human Review

The flagship model of the Qwen3 family, featuring a Mixture-of-Experts architecture with 235B total parameters (22B active). It introduces 'Hybrid Reasoning', letting it switch between standard generation and a deeper 'Thinking Mode' for complex logic. Trained on 36 trillion tokens, it supports 119 languages and dialects. The broader Qwen3 family spans six dense models (0.6B to 32B parameters) and two MoE models (30B total/3B active and 235B total/22B active).
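Since the mode switch is driven at the prompt-template level, here is a minimal sketch following the chat-template usage Qwen publishes for Hugging Face transformers; the `enable_thinking` flag toggles Thinking Mode, while the generation settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Loading the 235B flagship needs a multi-GPU node; a smaller family
# member (e.g. Qwen/Qwen3-8B) is a practical stand-in for testing.
model_name = "Qwen/Qwen3-235B-A22B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are below 100?"}]

# enable_thinking=True (the default) lets the model emit a <think>...</think>
# trace before its final answer; set it to False for standard generation.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[-1]:],
                       skip_special_tokens=True))
```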

Release Date
2025-04-29

Specifications

Parameters
235B (22B active)
Architecture
Mixture of Experts (MoE); see the routing sketch after this list
License
Apache 2.0
Context Window
128,000 tokens
Max Output
16,384 tokens
Training Data Cutoff
Apr 2025
Type
text
Modalities
text
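To make the "235B total / 22B active" split concrete: in a sparse MoE layer, a router picks a small subset of expert feed-forward networks per token, so only those experts' weights participate in the forward pass. The sketch below is a generic top-k router for illustration, not Qwen3's actual implementation; the 128-expert / 8-active default mirrors the configuration reported for Qwen3's MoE layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Generic top-k Mixture-of-Experts layer (illustrative, not Qwen3's code).

    Each token is routed to k of n_experts feed-forward networks, so only a
    small fraction of the layer's parameters is active per forward pass:
    the mechanism behind "235B total, 22B active".
    """

    def __init__(self, d_model: int, d_ff: int, n_experts: int = 128, k: int = 8):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); the router scores every expert per token.
        logits = self.router(x)
        weights, idx = logits.topk(self.k, dim=-1)   # keep the k best experts
        weights = F.softmax(weights, dim=-1)         # renormalize their scores
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e             # tokens routed to expert e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out
```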

Advanced Specifications

Model Family
Qwen
API Access
Available (see the API sketch after this list)
Chat Interface
Available
Multilingual Support
Yes
Variants
Qwen3-30B-A3B (MoE), Qwen3-32B (Dense), Qwen3-14B (Dense), Qwen3-8B (Dense), Qwen3-4B (Dense), Qwen3-1.7B (Dense), Qwen3-0.6B (Dense)
Hardware Support
CUDA, TPU
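Since the card lists API access as available, a hedged sketch of calling the model through an OpenAI-compatible endpoint follows. The base_url and model identifier reflect Alibaba Cloud Model Studio's (DashScope's) compatible mode, but treat both as assumptions to verify against the current provider docs:

```python
from openai import OpenAI

# Assumed endpoint and model ID; confirm both against the provider
# documentation before relying on them.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

resp = client.chat.completions.create(
    model="qwen3-235b-a22b",
    messages=[{"role": "user",
               "content": "Summarize the Qwen3 family in one sentence."}],
)
print(resp.choices[0].message.content)
```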

Capabilities & Limitations

Capabilities
hybrid reasoning, thinking mode, non-thinking mode, mathematics, coding, logical reasoning, function calling, tool use, creative writing, multilingual
Notable Use Cases
agent-based applications, complex reasoning tasks, multilingual applications, Alibaba's Quark AI assistant, scientific research, agentic workflows
Function Calling Support
Yes
Tool Use Support
Yes (see the tool-calling sketch below)
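For function calling, the OpenAI-compatible tools parameter is the usual route. The sketch below uses a hypothetical get_weather tool and the same assumed endpoint and model identifier as above:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # assumed endpoint
)

# Hypothetical tool schema; get_weather is an illustrative name, not a real API.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="qwen3-235b-a22b",  # assumed model identifier
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=tools,
)
# When the model opts to call the tool, the reply carries structured
# tool_calls instead of prose content.
print(resp.choices[0].message.tool_calls)
```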
