
Kimi K2

Moonshot AI · Open Weights · Verified

Moonshot's latest Mixture-of-Experts model, with 1 trillion total parameters and 32 billion activated per token. It achieves state-of-the-art performance among non-thinking models on frontier knowledge, math, and coding, and is meticulously optimized for agentic tasks, with sophisticated tool-use capabilities and multi-turn interactions.

Released 2025-07-11 · 1T total, 32B activated · Mixture of Experts (MoE) · Open Source

Specifications

Parameters
1T total, 32B activated
Architecture
Mixture of Experts (MoE)
License
Open Source
Context Window
128,000 tokens
Training Data Cutoff
2025-04
Type
text
Modalities
text
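The "total vs. activated" split reflects the MoE design: a learned router picks a few experts per token, so each forward pass touches only a small slice of the full parameter count. Below is a minimal, illustrative top-k routing sketch in PyTorch; the dimensions, expert count, and k are placeholders, not Kimi K2's actual configuration.

```python
import torch
import torch.nn.functional as F

class TopKMoELayer(torch.nn.Module):
    """Toy top-k Mixture-of-Experts layer: only k of n_experts run per
    token, so activated parameters are a small fraction of the total."""

    def __init__(self, d_model=1024, d_ff=4096, n_experts=64, k=2):
        super().__init__()
        self.k = k
        self.router = torch.nn.Linear(d_model, n_experts, bias=False)
        # Per-expert feed-forward weights (illustrative sizes).
        self.w_in = torch.nn.Parameter(torch.randn(n_experts, d_model, d_ff) * 0.02)
        self.w_out = torch.nn.Parameter(torch.randn(n_experts, d_ff, d_model) * 0.02)

    def forward(self, x):                        # x: [tokens, d_model]
        scores = self.router(x)                  # [tokens, n_experts]
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # mix the k chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            e = idx[:, slot]                     # chosen expert id per token
            h = torch.einsum('td,tdf->tf', x, self.w_in[e]).relu()
            out += weights[:, slot:slot+1] * torch.einsum('tf,tfd->td', h, self.w_out[e])
        return out

layer = TopKMoELayer()
print(layer(torch.randn(4, 1024)).shape)  # torch.Size([4, 1024])
```

With 64 experts and k=2 in this toy, each token touches roughly 3% of the expert weights; Kimi K2's 32B-activated-of-1T ratio follows the same principle at much larger scale.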

Benchmark Scores

A comprehensive code editing benchmark based on Exercism coding exercises across 6 programming languages.

American Invitational Mathematics Examination (AIME) 2024 problems.

American Invitational Mathematics Examination (AIME) 2025 problems.

A benchmark for measuring browsing agents' ability to navigate the web and find hard-to-find, entangled information.

Grade School Math 8K (GSM8K) consists of 8.5K high-quality grade school math word problems.

HMMT: 89.3
Harvard-MIT Mathematics Tournament (HMMT) problems test advanced high school mathematical problem-solving.

Evaluates code generation capabilities by asking models to complete Python functions based on docstrings.

A challenging benchmark of novel problems designed to test the limits of AI capabilities.

A large-scale benchmark of 400 International Mathematical Olympiad-level problems with verifiable answers.

A contamination-limited benchmark with frequently updated questions from recent sources, scoring answers automatically against objective ground truth.

Benchmark for evaluating LLMs on code generation tasks from contests.

MATH: 97.4
A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning.

MMLU: 89.5
Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including mathematics, history, computer science, and law.

MMLU-Pro is an enhanced benchmark with over 12,000 challenging questions across 14 domains.

A benchmark of simple but precise questions to test factual knowledge and reasoning.

TAU-bench (τ-bench) evaluates models on their ability to use tools.

Evaluates models on their ability to use terminal commands to solve system administration tasks.


Advanced Specifications

Model Family
Kimi
API Access
Available
Chat Interface
Available
Multilingual Support
Yes
Variants
Kimi-K2-Base, Kimi-K2-Instruct
Hardware Support
CUDA, vLLM, SGLang, KTransformers, TensorRT-LLM
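Because the weights are open and vLLM appears in the hardware support list above, a common way to run the model locally is vLLM's OpenAI-compatible server. A sketch, assuming the Hugging Face repo id moonshotai/Kimi-K2-Instruct; the serve flags are illustrative and depend on your GPUs.

```python
# First, serve the open weights with vLLM (shell command; flags illustrative):
#   vllm serve moonshotai/Kimi-K2-Instruct --trust-remote-code --tensor-parallel-size 8
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API on port 8000 by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[{"role": "user",
               "content": "Summarize the Mixture-of-Experts idea in two sentences."}],
    temperature=0.6,
    max_tokens=256,
)
print(resp.choices[0].message.content)
```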

Capabilities & Limitations

Capabilities
reasoning, code, math, agentic, tool-use, multilingual, long-context, online-search, deep-thinking
Known Limitations
excessive token generation on hard reasoning; performance decline with tool use on certain tasks; incomplete tool calls with unclear definitions
Notable Use Cases
agentic applications, coding assistant, data analysis, web development, research tasks
Function Calling Support
Yes
Tool Use Support
Yes
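Function calling in typical Kimi K2 deployments follows the OpenAI-style tools schema. A sketch against the same local endpoint as above; the get_weather tool and its fields are hypothetical. Note the precise JSON schema: per the limitations listed above, vague tool definitions can lead to incomplete tool calls.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical tool: explicit types, descriptions, and required fields
# reduce the risk of incomplete or malformed tool calls.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Paris'"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)  # e.g. get_weather {"city": "Paris"}
```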
