Kimi K2
Moonshot AI's latest Mixture-of-Experts model, with 1 trillion total parameters and 32 billion activated per token. It achieves state-of-the-art performance in frontier knowledge, math, and coding among non-thinking models, and is meticulously optimized for agentic tasks, with sophisticated tool use and multi-turn interaction.
Specifications
- Parameters
- 1T total, 32B activated
- Architecture
- Mixture of Experts (MoE); see the routing sketch after this list
- License
- Open Source
- Context Window
- 128,000 tokens
- Training Data Cutoff
- 2025-04
- Type
- text
- Modalities
- text
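The "activated parameters" figure comes from MoE routing: a gating network picks a few experts per token, so only those experts' weights participate in that token's forward pass. Below is a minimal NumPy sketch of top-k gating; the expert count, top-k, and dimensions are toy values for illustration, not K2's actual configuration.

```python
import numpy as np

# Toy top-k expert routing for one Mixture-of-Experts layer.
# Sizes are illustrative only; they are not Kimi K2's real dimensions.
num_experts, top_k, d_model = 8, 2, 16

rng = np.random.default_rng(0)
router = rng.normal(size=(d_model, num_experts))            # gating weights
experts = rng.normal(size=(num_experts, d_model, d_model))  # one FFN matrix per expert

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route a single token vector through its top-k experts."""
    logits = x @ router                     # router score for each expert
    top = np.argsort(logits)[-top_k:]       # indices of the k highest-scoring experts
    gate = np.exp(logits[top])
    gate /= gate.sum()                      # softmax over the chosen experts only
    # Only the selected experts' parameters run ("activate") for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(gate, top))

out = moe_forward(rng.normal(size=d_model))
print(out.shape)  # (16,)
```

Scaling this mechanism up is what lets K2 hold 1T parameters while running only 32B of them per token.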
Benchmark Scores
- MMLU: Massive Multitask Language Understanding; tests knowledge across 57 subjects, including mathematics, history, law, and medicine.
- MMLU-Pro: an enhanced benchmark with over 12,000 challenging questions across 14 domains.
- MATH: a dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning.
- GSM8K: Grade School Math 8K; 8.5K high-quality grade school math word problems.
- HumanEval: evaluates code generation by asking models to complete Python functions based on their docstrings.
- LiveBench: a contamination-limited benchmark with frequently updated questions from recent sources, scored automatically against ground truth.
Advanced Specifications
- Model Family
- Kimi
- API Access
- Available
- Chat Interface
- Available
- Multilingual Support
- Yes
- Variants
- Kimi-K2-Base, Kimi-K2-Instruct
- Hardware Support
- CUDA, vLLM, SGLang, KTransformers, TensorRT-LLM (see the serving sketch below)
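Since the card lists vLLM among supported stacks, here is a minimal local-serving sketch using vLLM's Python API. The Hugging Face repo id and tensor-parallel size are assumptions, not values stated on this card, and a 1T-parameter checkpoint needs a correspondingly large multi-GPU deployment.

```python
# Minimal vLLM inference sketch; requires `pip install vllm`.
# Model id and parallelism below are assumptions, not values from this card.
from vllm import LLM, SamplingParams

llm = LLM(
    model="moonshotai/Kimi-K2-Instruct",  # assumed HF repo id
    tensor_parallel_size=8,               # placeholder; size to your GPU cluster
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.chat(
    [{"role": "user", "content": "Explain Mixture-of-Experts routing in two sentences."}],
    params,
)
print(outputs[0].outputs[0].text)
```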
Capabilities & Limitations
- Capabilities
- reasoning, code, math, agentic, tool-use, multilingual, long-context, online-search, deep-thinking
- Known Limitations
- excessive token generation on hard reasoning; performance decline with tool use on certain tasks; incomplete tool calls with unclear tool definitions
- Notable Use Cases
- agentic applications, coding assistant, data analysis, web development, research tasks
- Function Calling Support
- Yes
- Tool Use Support
- Yes (see the tool-calling sketch below)
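Both flags are typically exercised through an OpenAI-compatible chat completions endpoint. A hedged sketch follows; the base URL, model id, and the get_weather tool are assumptions for illustration, not values from this card. Note the limitation listed above: underspecified tool schemas can yield incomplete tool calls, so precise parameter descriptions matter.

```python
# Tool-calling sketch against an OpenAI-compatible endpoint.
# Base URL, model id, and the tool itself are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="kimi-k2-instruct",  # assumed model id
    messages=[{"role": "user", "content": "What's the weather in Beijing?"}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:
    # The model returned structured arguments instead of prose; the caller
    # runs the tool and sends the result back in a follow-up message.
    call = msg.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(msg.content)
```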