Z.ai: GLM 4.6 (exacto)

z-ai · Released Sep 30, 2025
Our Score: 28

Compared with GLM-4.5, this generation brings several key improvements:

- Longer context window: expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks.
- Superior coding performance: higher scores on code benchmarks and better real-world performance in applications such as Claude Code, Cline, Roo Code, and Kilo Code, including improvements in generating visually polished front-end pages.
- Advanced reasoning: a clear improvement in reasoning performance, with support for tool use during inference, leading to stronger overall capability.
- More capable agents: stronger performance in tool use and search-based agents, and more effective integration within agent frameworks.
- Refined writing: better alignment with human preferences in style and readability, and more natural performance in role-playing scenarios.

Input Price: $0.44 / 1M tokens
Output Price: $1.76 / 1M tokens
Context Window: 204,800 tokens
Max Output: 131,072 tokens
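A quick budgeting sketch using the figures above, assuming (as is typical for these models) that completion tokens share the same context window as the prompt; the `fits` helper is hypothetical, not part of any API:

```python
CONTEXT_WINDOW = 204_800   # total context window in tokens (from this page)
MAX_OUTPUT = 131_072       # max completion tokens (from this page)

def fits(prompt_tokens: int, requested_output: int) -> bool:
    """Return True if the prompt plus the requested completion fit the window."""
    if requested_output > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output <= CONTEXT_WINDOW

print(fits(180_000, 20_000))  # True:  200,000 <= 204,800
print(fits(180_000, 30_000))  # False: 210,000 >  204,800
```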

Capabilities

Tool Use, Function Calling
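Since the model supports function calling, here is a minimal sketch of a tools request built for OpenRouter's OpenAI-compatible chat completions endpoint, using the model ID listed on this page; the `get_weather` tool is a hypothetical example, and actually sending it would require an API key and an HTTP POST to the OpenRouter API:

```python
# Construct (but do not send) a function-calling request payload for this model.
payload = {
    "model": "z-ai/glm-4.6:exacto",
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example tool
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"}
                    },
                    "required": ["city"],
                },
            },
        }
    ],
}

print(payload["model"])  # z-ai/glm-4.6:exacto
```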

Architecture

Modality: Text → Text
Tokenizer: Other

Model Information

OpenRouter ID: z-ai/glm-4.6:exacto
Provider: z-ai
Release Date: September 30, 2025
Context Length: 204,800 tokens
Max Completion: 131,072 tokens
Status: Active

Pricing

Token Type   Cost per 1M tokens   Cost per 1K tokens
Input        $0.44                $0.000440
Output       $1.76                $0.001760
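The per-request cost follows directly from the per-1M rates in the table above; a minimal sketch (the `estimate_cost` helper is hypothetical):

```python
INPUT_PER_M = 0.44    # USD per 1M input tokens (from the table above)
OUTPUT_PER_M = 1.76   # USD per 1M output tokens (from the table above)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at this model's listed rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 10K-token prompt with a 2K-token completion:
print(round(estimate_cost(10_000, 2_000), 6))  # 0.00792
```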

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Avg Uptime: 99.8%
Best Latency (TTFT): 355 ms
Best Throughput: 151 tok/s
Active Endpoints: 6/6
Available via: SiliconFlow, DeepInfra, AtlasCloud, Novita, BaseTen, Z.AI