Mistral AI: European AI company developing state-of-the-art language models with a focus on efficiency.
Flagship 123B-parameter model from Mistral AI, progressively upgraded through successive 2024 releases. Positioned as a top-performing open model that competes with Meta's Llama family on knowledge and reasoning tasks.
A 123B-parameter **multimodal** model from Mistral that combines language with vision (hence the "Pix-" prefix). A smaller 12B version was released under Apache 2.0. Pixtral extends Mistral's LLM capabilities to image understanding and description tasks.
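As a rough illustration of the image-understanding workflow, the sketch below sends a text-plus-image message to Mistral's chat completions endpoint; the `pixtral-12b-2409` model name, payload shape, and example image URL are assumptions drawn from Mistral's public API docs rather than from this description.

```python
import os
import requests

# Minimal sketch of a multimodal chat request (payload shape and model
# name are assumptions based on Mistral's public API documentation).
resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "pixtral-12b-2409",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": "https://example.com/chart.png"},
            ],
        }],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```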
The smallest model in Mistral AI's Ministral family, part of 'les Ministraux'. A state-of-the-art edge model optimized for on-device computing with exceptional knowledge, commonsense, reasoning, and function-calling capabilities in the sub-10B category.
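Because function calling is highlighted as a strength, the following sketch shows how a hypothetical tool definition could be passed to a Ministral-class model through Mistral's chat API; the tool itself, its schema, and the `ministral-3b-latest` alias are illustrative assumptions.

```python
import os
import requests

# Hypothetical tool; the schema follows Mistral's published function-calling
# format, and "ministral-3b-latest" is an assumed model alias.
tools = [{
    "type": "function",
    "function": {
        "name": "get_battery_level",  # hypothetical on-device function
        "description": "Return the device battery level as a percentage.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}]

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "ministral-3b-latest",
        "messages": [{"role": "user", "content": "How much battery do I have left?"}],
        "tools": tools,
        "tool_choice": "auto",
    },
    timeout=60,
)
# Expect a tool call in the reply rather than plain text.
print(resp.json()["choices"][0]["message"].get("tool_calls"))
```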
A state-of-the-art 8B-parameter model from Mistral AI's Ministral family, part of 'les Ministraux'. Features a special interleaved sliding-window attention pattern for faster and more memory-efficient inference, designed for edge computing and on-device applications.
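To make the attention pattern concrete, the sketch below builds boolean masks for an interleaved scheme in which some layers attend over the full causal prefix and others only over a fixed sliding window; the window size and even/odd interleaving are illustrative assumptions, not the model's actual configuration.

```python
import numpy as np

def causal_mask(seq_len, window=None):
    """Boolean mask where entry (i, j) is True if token i may attend to token j."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    mask = j <= i                     # causal: only attend to the past
    if window is not None:
        mask &= (i - j) < window      # sliding window: only the last `window` tokens
    return mask

# Illustrative interleaving: even layers use full causal attention,
# odd layers use a small sliding window (sizes are assumptions).
num_layers, window = 4, 3
masks = [causal_mask(8, None if layer % 2 == 0 else window)
         for layer in range(num_layers)]
print(masks[1].astype(int))  # the windowed pattern used by an odd layer
```

Restricting most layers to a short window keeps the key-value cache and attention compute bounded per token, which is what makes this pattern attractive for memory-constrained edge devices.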
Mistral AI's first-ever code model. An open-weight generative AI model explicitly designed for code generation tasks. Trained on a diverse dataset of 80+ programming languages including Python, Java, C, C++, JavaScript, Bash, Swift, and Fortran. Helps developers write and interact with code through shared instruction and completion API endpoints.
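For the completion side, Mistral documents a fill-in-the-middle endpoint; the sketch below shows roughly how a prompt/suffix pair could be sent to it. The `/v1/fim/completions` path and `codestral-latest` alias come from Mistral's public Codestral docs but are treated here as assumptions, and the exact response shape is not assumed.

```python
import os
import requests

# Rough sketch of a fill-in-the-middle (code completion) request; the path
# and model alias are taken from Mistral's public docs but treated as assumptions.
resp = requests.post(
    "https://api.mistral.ai/v1/fim/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "codestral-latest",
        "prompt": "def fibonacci(n: int) -> int:\n    ",
        "suffix": "\n\nprint(fibonacci(10))",
        "max_tokens": 64,
    },
    timeout=60,
)
print(resp.json())  # completion text for the gap between prompt and suffix
```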
A sparse Mixture-of-Experts (SMoE) model that activates only 39B of its 141B total parameters per token, giving it an unusually strong cost-to-performance ratio for its size and setting a high bar for efficient open models.
A sparse Mixture-of-Experts model (8 experts) with 46.7B total parameters, of which 12.9B are active per token. Outperformed Llama 2 70B and GPT-3.5 on many benchmarks at a significantly lower compute cost.
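A toy sketch of the top-2 expert routing used by both Mixtral models may help explain why only a fraction of the total parameters are active per token; the layer sizes below are tiny illustrative stand-ins, not the real dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, num_experts, top_k = 16, 64, 8, 2   # toy sizes, not Mixtral's real dims

gate = rng.standard_normal((d_model, num_experts))
experts = [(rng.standard_normal((d_model, d_ff)),
            rng.standard_normal((d_ff, d_model))) for _ in range(num_experts)]

def moe_layer(x):
    """Route each token to its top-2 experts and mix their outputs by gate weight."""
    logits = x @ gate                               # (tokens, num_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]   # indices of the 2 highest-scoring experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        weights = np.exp(logits[t, top[t]])
        weights /= weights.sum()                    # softmax over the chosen experts only
        for w, e in zip(weights, top[t]):
            w_in, w_out = experts[e]
            out[t] += w * (np.maximum(x[t] @ w_in, 0) @ w_out)  # simple ReLU FFN stand-in
    return out

tokens = rng.standard_normal((4, d_model))
print(moe_layer(tokens).shape)  # (4, 16): each token ran through only 2 of the 8 experts
```

Because only the top-2 of 8 experts run per token, the active parameter count is roughly the shared layers plus a quarter of the expert weights, which is how 141B total parameters translate to about 39B active in the 8x22B model and 46.7B to 12.9B in the 8x7B model.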
7.3B-parameter model trained from scratch with improved training data and architecture enhancements such as grouped-query and sliding-window attention. Open-source (Apache 2.0), with strong performance that rivaled larger 13B+ models on many benchmarks.
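Since the weights are Apache 2.0, a common way to try the model locally is via the Hugging Face transformers library; the sketch below assumes the `mistralai/Mistral-7B-Instruct-v0.2` repository id and a GPU with enough memory, neither of which is stated above.

```python
# Minimal sketch of loading the open weights with Hugging Face transformers;
# the repo id is an assumption, the download is roughly 15 GB, and
# device_map="auto" requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain sliding-window attention in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```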