
Mixtral 8×7B

Mistral AI · Open Source · Pending Verification

A sparse Mixture-of-Experts model with 8 experts per layer (2 routed per token), totalling 46.7B parameters of which 12.9B are active per token. It outperforms dense models such as Llama 2 70B and GPT-3.5 on many benchmarks at a significantly lower compute cost.
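The gap between the 46.7B total and 12.9B active parameters comes from the router selecting only 2 of the 8 expert feed-forward blocks for each token. The sketch below illustrates that top-2 routing pattern in PyTorch; the hidden sizes, module structure, and names (SparseMoEFeedForward, d_model, d_ff) are illustrative assumptions, not Mixtral's actual implementation.

```python
# Minimal sketch of sparse top-2 Mixture-of-Experts routing (illustrative, not Mixtral's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoEFeedForward(nn.Module):
    """Route each token to 2 of 8 expert feed-forward networks.

    Only the selected experts run for a given token, which is why the active
    parameter count (~12.9B for Mixtral) is far below the total (~46.7B).
    """

    def __init__(self, d_model: int = 512, d_ff: int = 2048,
                 num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model) -- one row per token
        logits = self.router(x)                             # (tokens, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # pick the 2 best experts per token
        weights = F.softmax(weights, dim=-1)                # renormalise their gate weights

        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_idx, slot = (indices == e).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue  # this expert received no tokens in the batch
            # Weighted contribution of expert e to the tokens routed to it
            out[token_idx] += weights[token_idx, slot].unsqueeze(-1) * expert(x[token_idx])
        return out


if __name__ == "__main__":
    layer = SparseMoEFeedForward()
    tokens = torch.randn(16, 512)  # 16 tokens with an illustrative hidden size
    print(layer(tokens).shape)     # torch.Size([16, 512])
```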

Release Date
2023-12-01

Specifications

Parameters
46.7B (8×7B)
Architecture
Mixture-of-Experts Transformer
License
Apache 2.0
Context Window
32,768 tokens
Type
text
Modalities
text

Advanced Specifications

Model Family
Mixtral
API Access
Not Available
Chat Interface
Not Available

Capabilities & Limitations

Related Models