Z.ai: GLM 4.6

Z.ai: GLM 4.6

z-ai · Released Sep 30, 2025
65
Our Score

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks.
Superior coding performance: The model achieves higher scores on code benchmarks and demonstrates better real-world performance in applications such as Claude Code、Cline、Roo Code and Kilo Code, including improvements in generating visually polished front-end pages.
Advanced reasoning: GLM-4.6 shows a clear improvement in reasoning performance and supports tool use during inference, leading to stronger overall capability.
More capable agents: GLM-4.6 exhibits stronger performance in tool using and search-based agents, and integrates more effectively within agent frameworks.
Refined writing: Better aligns with human preferences in style and readability, and performs more naturally in role-playing scenarios.

$0.39 / 1M Input Price
$1.90 / 1M Output Price
204,800 tokens Context Window
204,800 tokens Max Output

Capabilities

Tool Use Function Calling

Architecture

ModalityText → Text
TokenizerOther

Performance Indices

Source: Artificial Analysis

32.5 Intelligence Index
29.5 Coding Index
47.8 Agentic Index
86 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 78%
Graduate-level scientific reasoning
HLE 13.3%
Humanity's Last Exam
MMLU Pro 82.9%
Multi-task language understanding
LiveCodeBench 69.5%
Live coding evaluation
SciCode 38.4%
Scientific computing
AIME 2025 86%
Competition mathematics (2025)
IFBench 43.4%
Instruction following
LCR 54.3%
Long-context reasoning
TerminalBench Hard 25%
Agentic terminal tasks
τ²-Bench 70.5%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID z-ai/glm-4.6
Providerz-ai
Release Date September 30, 2025
Context Length204,800 tokens
Max Completion204,800 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.39 $0.000390
Output $1.90 $0.001900

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

99.9%
Avg Uptime
326ms
Best Latency (TTFT)
128 tok/s
Best Throughput
6/6
Active Endpoints
Available via: SiliconFlow, DeepInfra, AtlasCloud, Novita, BaseTen, Z.AI

Leaderboard Categories