Mixtral 8×22B
A sparse Mixture-of-Experts (SMoE) model that activates only 39B of its 141B total parameters per token, giving it the inference cost profile of a much smaller dense model while retaining large-model capability. Released under Apache 2.0, it ranks among the most cost-efficient open-weight models of its generation.
2024-04-17
141B (39B active)
Sparse Mixture-of-Experts (SMoE)
Apache 2.0
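To illustrate why an SMoE layer keeps only a fraction of its parameters active, the sketch below implements top-2 expert routing in PyTorch. Mixtral routes each token to 2 of 8 experts per layer; the dimensions, class name, and expert structure here are illustrative placeholders, not the model's actual configuration.

```python
# Minimal sketch of a sparse Mixture-of-Experts layer with top-2 gating.
# Sizes and expert structure are illustrative, not Mixtral 8x22B's real values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim=512, hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, n_experts, bias=False)   # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )

    def forward(self, x):                                    # x: (tokens, dim)
        scores = self.gate(x)                                # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)       # keep k experts per token
        weights = F.softmax(weights, dim=-1)                 # renormalise over chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                        # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(4, 512)
print(SparseMoE()(tokens).shape)                             # torch.Size([4, 512])
```

With 8 experts and top_k=2, each token passes through only the router and two expert MLPs, which is why the active parameter count (39B) is far below the total (141B).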
Specifications
- Parameters
- 141B (39B active)
- Architecture
- Sparse Mixture-of-Experts (SMoE)
- License
- Apache 2.0
- Context Window
- 65,536 tokens
- Type
- text
- Modalities
- text
Benchmark Scores
MMLU: 77.8
Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects, including mathematics.
GSM8K: 88.0
Grade School Math 8K (GSM8K) consists of 8.5K high-quality grade school math word problems.
HumanEval
Evaluates code generation capabilities by asking models to complete Python functions based on docstrings.
MATH: 42.5
A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning.
FACTS Grounding: 70.1
The FACTS Grounding Leaderboard evaluates LLMs' ability to generate factually accurate long-form responses.
Advanced Specifications
- Model Family
- Mixtral
- API Access
- Not Available
- Chat Interface
- Not Available
- Multilingual Support
- Yes
Capabilities & Limitations
- Capabilities
- reasoning, math, coding, multilingual, function-calling
- Notable Use Cases
- application development, tech stack modernisation, large document analysis, multilingual tasks
- Function Calling Support
- Yes
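Because the model supports function calling, the hedged sketch below shows one generic way to dispatch a JSON tool call emitted by the model to a local Python function. The get_weather tool, the TOOLS registry, and the exact output format are illustrative assumptions; they do not reproduce Mistral's official tool-calling template.

```python
# Generic sketch of handling a function call emitted by the model as JSON.
# Tool schema and output format are assumed for illustration only.
import json

def get_weather(city: str) -> str:
    """Hypothetical local tool the model is allowed to call."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

# Suppose the model, given the tool schema, replied with this tool call:
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["arguments"])
print(result)  # the result would be fed back to the model in a follow-up turn
```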