
Qwen-2
Second-generation Qwen model (72B parameters), open-sourced under Alibaba's Qianwen License. Trained on 3T tokens, it was released alongside smaller variants and is intended as a foundation for Alibaba's enterprise AI services.
Specifications
- Parameters: 72B
- Architecture: Decoder-only Transformer with Grouped Query Attention
- License: Qianwen (open) / Apache 2.0
- Context Window: 128,000 tokens
- Type: text
- Modalities: text
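To illustrate why Grouped Query Attention matters at a 128,000-token context window, the sketch below maps query heads to shared key/value heads and compares the key/value cache size with and without that sharing. The layer count, head counts, head dimension, and 16-bit cache precision are illustrative assumptions, not values stated on this page, and the helper names are hypothetical.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the shared key/value head used by a query head."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads  # query heads per key/value head
    return q_head // group_size


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for a single sequence."""
    # The factor 2 covers one key tensor plus one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Assumed, illustrative shapes (not taken from this page).
N_LAYERS, N_Q_HEADS, N_KV_HEADS, HEAD_DIM = 80, 64, 8, 128
SEQ_LEN = 128_000  # context window listed in the specifications

# Standard multi-head attention keeps one key/value head per query head.
mha_bytes = kv_cache_bytes(N_LAYERS, N_Q_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped query attention keeps only the shared key/value heads.
gqa_bytes = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"multi-head cache:    {mha_bytes / 2**30:.1f} GiB")
print(f"grouped-query cache: {gqa_bytes / 2**30:.1f} GiB")
print(f"reduction factor:    {mha_bytes // gqa_bytes}x")
```

Under these assumptions the cache shrinks by the ratio of query heads to key/value heads, which is the main reason the technique is used for long-context models.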
Benchmark Scores
Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...
MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains including...
Graduate-Level Google-Proof Q&A (GPQA) evaluates advanced reasoning on graduate-lev...
Grade School Math 8K (GSM8K) consists of 8.5K high-quality grade school math word problems....
MATH is a dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning....
HumanEval evaluates code generation capabilities by asking models to complete Python functions based on docstr...
Advanced Specifications
- Model Family: Qwen
- API Access: Not Available
- Chat Interface: Not Available
- Multilingual Support: Yes
Capabilities & Limitations
- Capabilities: language understanding, language generation, multilingual, coding, mathematics, reasoning