Qwen-2

Alibaba · Open Source · Verified

Second-generation Qwen model (72B parameters), open-sourced under Alibaba's Qianwen License. It was trained on 3T tokens, released alongside smaller variants, and is intended as the foundation for Alibaba's enterprise AI services.

2024-06-11

Specifications

Parameters
72B
Architecture
Decoder-only Transformer with Grouped-Query Attention
License
Qianwen (open) / Apache 2.0
Context Window
128,000 tokens
Type
text
Modalities
text
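
The parameter count, context window, and attention type listed above together determine most of the memory needed to serve the model. The following is a rough back-of-the-envelope sketch, not a measurement. The layer count, head counts, and head dimension are illustrative assumptions that do not appear on this card, and the function names are hypothetical. It estimates weight memory at 16-bit precision and compares the key/value cache at the full context window with and without grouped-query attention.

```python
# Back-of-the-envelope memory estimate for a 72B-parameter decoder-only model.
# Values marked ASSUMED are illustrative and are not taken from this card.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def kv_cache_gb(num_layers: int, num_kv_heads: int, head_dim: int,
                seq_len: int, bytes_per_value: int = 2) -> float:
    """KV-cache size for one sequence, in GB.

    The leading factor of 2 accounts for storing both keys and values.
    """
    total = 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value
    return total / 1e9


PARAMS = 72e9        # from the card: 72B parameters
CONTEXT = 128_000    # from the card: 128,000-token context window

NUM_LAYERS = 80      # ASSUMED
NUM_Q_HEADS = 64     # ASSUMED: query heads
NUM_KV_HEADS = 8     # ASSUMED: shared key/value heads under grouped-query attention
HEAD_DIM = 128       # ASSUMED

weights_16bit = weight_memory_gb(PARAMS, 2)  # 2 bytes per parameter -> 144.0 GB

# Standard multi-head attention keeps one K/V head per query head.
mha_cache = kv_cache_gb(NUM_LAYERS, NUM_Q_HEADS, HEAD_DIM, CONTEXT)
# Grouped-query attention shares each K/V head across a group of query heads.
gqa_cache = kv_cache_gb(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, CONTEXT)

print(f"weights (16-bit): {weights_16bit:.1f} GB")
print(f"KV cache, one K/V head per query head: {mha_cache:.1f} GB")
print(f"KV cache, grouped-query attention:     {gqa_cache:.1f} GB")
```

Under these assumed dimensions the cache shrinks by the ratio of query heads to key/value heads, which is why grouped-query attention matters most at long context lengths.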

Benchmark Scores

MMLU: 84.2

Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...

MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains including...

GPQA: 37.9

Graduate-Level Google-Proof Q&A (GPQA) evaluates advanced reasoning on graduate-lev...

GSM8K: 89.5

Grade School Math 8K (GSM8K) consists of 8.5K high-quality grade school math word problems....

MATH: 51.1

A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning....

Evaluates code generation capabilities by asking models to complete Python functions based on docstr...

Advanced Specifications

Model Family
Qwen
API Access
Not Available
Chat Interface
Not Available
Multilingual Support
Yes

Capabilities & Limitations

Capabilities
language understanding, language generation, multilingual, coding, mathematics, reasoning

Related Models