Z.ai: GLM 4.7 Flash

z-ai · Released Jan 19, 2026
Our Score: 38

GLM-4.7-Flash is a 30B-class model that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding ability, long-horizon task planning, and tool collaboration, and it achieves leading performance among open-source models of its size on several current public benchmark leaderboards.

Input Price: $0.06 / 1M tokens
Output Price: $0.40 / 1M tokens
Context Window: 202,752 tokens
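The context window caps the combined size of the prompt and the completion. A minimal sketch of that budgeting check, using the 202,752-token figure above (token counts here are illustrative):

```python
CONTEXT_WINDOW = 202_752  # tokens, from the listing above

def fits_context(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Whether the prompt plus the requested completion fits in the window."""
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW

print(fits_context(200_000, 2_000))  # True: 202,000 <= 202,752
print(fits_context(200_000, 4_000))  # False: 204,000 exceeds the window
```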

Capabilities

Tool Use · Function Calling

Architecture

Modality: Text → Text
Tokenizer: Other

Performance Indices

Source: Artificial Analysis

Intelligence Index: 30.1
Coding Index: 25.9
Agentic Index: 60.4

Benchmark Scores

Evaluations

GPQA Diamond: 58.1% (graduate-level scientific reasoning)
HLE: 7.1% (Humanity's Last Exam)
SciCode: 33.7% (scientific computing)
IFBench: 60.8% (instruction following)
LCR: 35% (long-context reasoning)
TerminalBench Hard: 22% (agentic terminal tasks)
τ²-Bench: 98.8% (conversational agent benchmark)

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID: z-ai/glm-4.7-flash
Provider: z-ai
Release Date: January 19, 2026
Context Length: 202,752 tokens
Status: Active
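The OpenRouter ID above is what you pass as the `model` field in a request. A minimal sketch of building the request body for OpenRouter's OpenAI-compatible chat-completions endpoint (the URL and the bearer-token auth scheme follow OpenRouter's API; the key itself is a placeholder):

```python
import json

MODEL_ID = "z-ai/glm-4.7-flash"  # OpenRouter ID from the listing above
URL = "https://openrouter.ai/api/v1/chat/completions"  # OpenAI-compatible endpoint

def build_request(prompt: str) -> dict:
    """Build the JSON body for a single-turn chat completion."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_request("Summarize this model's capabilities.")
print(json.dumps(body, indent=2))
# Send with any HTTP client, e.g. POST to URL with header
# "Authorization: Bearer YOUR_API_KEY" (placeholder) and this body as JSON.
```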

Pricing

Token Type | Cost per 1M tokens | Cost per 1K tokens
Input      | $0.06              | $0.000060
Output     | $0.40              | $0.000400
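The per-request cost at these rates is just a weighted sum of input and output token counts. A quick sketch (the example token counts are illustrative):

```python
# USD per 1M tokens, from the pricing table above
INPUT_PER_M = 0.06
OUTPUT_PER_M = 0.40

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# Example: a 10,000-token prompt with a 2,000-token completion
print(f"${request_cost(10_000, 2_000):.4f}")  # $0.0014
```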

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Avg Uptime: 94.5%
Best Latency (TTFT): 551 ms
Best Throughput: 74 tok/s
Active Endpoints: 5/5
Available via: DeepInfra, Z.AI, Novita, Phala, Venice

Leaderboard Categories