Kimi K2

Moonshot AI, Open Source, Verified

Kimi K2 is Moonshot's latest Mixture-of-Experts model, with 32 billion activated parameters out of 1 trillion total. It achieves state-of-the-art performance in frontier knowledge, math, and coding among non-thinking models, and it is optimized for agentic tasks, with sophisticated tool use and multi-turn interactions.

2025-07-11
1T total, 32B activated
Mixture of Experts (MoE)
Open Source

Specifications

Parameters: 1T total, 32B activated
Architecture: Mixture of Experts (MoE)
License: Open Source
Context Window: 128,000 tokens
Training Data Cutoff: 2025-04
Type: text
Modalities: text
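
As a rough illustration of what "1T total, 32B activated" means for a Mixture-of-Experts model, the sketch below computes the share of parameters that take part in processing a single token. It assumes the listed figures are rounded values, and the constant and function names are hypothetical, chosen only for this example.

```python
# Figures taken from the specification list above; both are rounded.
TOTAL_PARAMS = 1_000_000_000_000   # 1T total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32B parameters activated per token


def active_fraction(active: int, total: int) -> float:
    """Return the share of parameters used for a single token."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Activated share per token: {fraction:.1%}")
```

Under these rounded figures, only a few percent of the weights are active for any one token, which is why per-token compute tracks the 32B figure while memory requirements track the 1T figure.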

Benchmark Scores

MMLU: 89.5

Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathema...

MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains including...

MATH: 97.4

A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning....

Grade School Math 8K (GSM8K) consists of 8.5K high-quality grade school math word problems....

Evaluates code generation capabilities by asking models to complete Python functions based on docstr...

American Invitational Mathematics Examination (AIME) 2024 problems....

American Invitational Mathematics Examination (AIME) 2025 problems....

Tests models on their ability to write code in multiple programming languages....

A benchmark of simple but precise questions to test factual knowledge and reasoning....

A contamination-limited benchmark with frequently-updated questions from recent sources, scoring ans...

Advanced Specifications

Model Family: Kimi
API Access: Available
Chat Interface: Available
Multilingual Support: Yes
Variants: Kimi-K2-Base, Kimi-K2-Instruct
Hardware Support: CUDA, vLLM, SGLang, KTransformers, TensorRT-LLM

Capabilities & Limitations

Capabilities: reasoning, code, math, agentic, tool-use, multilingual, long-context, online-search, deep-thinking
Known Limitations: excessive token generation on hard reasoning; performance decline with tool use on certain tasks; incomplete tool calls with unclear definitions
Notable Use Cases: agentic applications, coding assistant, data analysis, web development, research tasks
Function Calling Support: Yes
Tool Use Support: Yes
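
Because the listed limitations include incomplete tool calls when definitions are unclear, it can help to give each tool an explicit schema and to check the model's arguments before running anything. The sketch below is a minimal illustration, not the provider's documented API. It assumes the JSON-Schema style of function definition used by many OpenAI-compatible chat APIs, and the `get_weather` tool, its fields, and the `check_tool_call` helper are all hypothetical.

```python
import json

# Hypothetical tool definition; the name and fields are illustrative only.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Paris'."},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}


def check_tool_call(tool: dict, arguments_json: str) -> list[str]:
    """Return a list of problems found in a model-produced tool call."""
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError:
        return ["arguments are not valid JSON"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    schema = tool["function"]["parameters"]
    # Report required arguments the model left out.
    problems = [
        f"missing required argument: {name}"
        for name in schema["required"]
        if name not in args
    ]
    # Report arguments that the schema does not define.
    problems += [
        f"unknown argument: {name}"
        for name in args
        if name not in schema["properties"]
    ]
    return problems


print(check_tool_call(GET_WEATHER_TOOL, '{"city": "Paris"}'))
print(check_tool_call(GET_WEATHER_TOOL, '{"unit": "celsius"}'))
```

An empty list means the call passed these basic checks. A non-empty list can be sent back to the model as an error message so that it can retry with complete arguments.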

Related Models