MoonshotAI: Kimi K2 Thinking

MoonshotAI: Kimi K2 Thinking

moonshotai · Released Nov 6, 2025
79
Our Score

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports 256 k-token context windows. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves step-by-step reasoning with tool use, enabling autonomous research, coding, and writing that can persist for hundreds of sequential actions without drift. It sets new open-source benchmarks on HLE, BrowseComp, SWE-Multilingual, and LiveCodeBench, while maintaining stable multi-agent behavior through 200–300 tool calls. Built on a large-scale MoE architecture with MuonClip optimization, it combines strong reasoning depth with high inference efficiency for demanding agentic and analytical tasks.

$0.47 / 1M Input Price
$2.00 / 1M Output Price
131,072 tokens Context Window

Capabilities

Tool Use Function Calling

Architecture

ModalityText → Text
TokenizerOther

Performance Indices

Source: Artificial Analysis

40.9 Intelligence Index
34.8 Coding Index
62.1 Agentic Index
94.7 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 83.8%
Graduate-level scientific reasoning
HLE 22.3%
Humanity's Last Exam
MMLU Pro 84.8%
Multi-task language understanding
LiveCodeBench 85.3%
Live coding evaluation
SciCode 42.4%
Scientific computing
AIME 2025 94.7%
Competition mathematics (2025)
IFBench 68.1%
Instruction following
LCR 66.3%
Long-context reasoning
TerminalBench Hard 31.1%
Agentic terminal tasks
τ²-Bench 93%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID moonshotai/kimi-k2-thinking
Providermoonshotai
Release Date November 6, 2025
Context Length131,072 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.47 $0.000470
Output $2.00 $0.002000

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

100%
Avg Uptime
371ms
Best Latency (TTFT)
151.5 tok/s
Best Throughput
3/8
Active Endpoints
Available via: DeepInfra, SiliconFlow, Moonshot AI, Nebius, Novita, Google, AtlasCloud

Leaderboard Categories