MoonshotAI: Kimi K2 0905

MoonshotAI: Kimi K2 0905

moonshotai · Released Sep 4, 2025
66
Our Score

Kimi K2 0905 is the September update of Kimi K2 0711. It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It supports long-context inference up to 256k tokens, extended from the previous 128k. This update improves agentic coding with higher accuracy and better generalization across scaffolds, and enhances frontend coding with more aesthetic and functional outputs for web, 3D, and related tasks. Kimi K2 is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. It excels across coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) benchmarks. The model is trained with a novel stack incorporating the MuonClip optimizer for stable large-scale MoE training.

$0.40 / 1M Input Price
$2.00 / 1M Output Price
131,072 tokens Context Window

Capabilities

Tool Use Function Calling

Architecture

ModalityText → Text
TokenizerOther

Performance Indices

Source: Artificial Analysis

30.9 Intelligence Index
25.9 Coding Index
48.5 Agentic Index
57.3 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 76.7%
Graduate-level scientific reasoning
HLE 6.3%
Humanity's Last Exam
MMLU Pro 81.9%
Multi-task language understanding
LiveCodeBench 61%
Live coding evaluation
SciCode 30.7%
Scientific computing
AIME 2025 57.3%
Competition mathematics (2025)
IFBench 41.7%
Instruction following
LCR 52.3%
Long-context reasoning
TerminalBench Hard 23.5%
Agentic terminal tasks
τ²-Bench 73.4%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID moonshotai/kimi-k2-0905
Providermoonshotai
Release Date September 4, 2025
Context Length131,072 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.40 $0.000400
Output $2.00 $0.002000

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

98.5%
Avg Uptime
237ms
Best Latency (TTFT)
153 tok/s
Best Throughput
7/8
Active Endpoints
Available via: DeepInfra, SiliconFlow, Moonshot AI, Novita, Fireworks, AtlasCloud, Groq

Leaderboard Categories