Z.ai: GLM 4.7 Flash

Z.ai: GLM 4.7 Flash

z-ai · Released Jan 19, 2026 Professional
Intelligence #10 / 565
82.0 Our Score
Speed #149 / 262
85.5 tokens / sec
Input #159 / 565
$0.060 per 1M tokens
Output #220 / 565
$0.400 per 1M tokens
Context #185 / 565
202,752 tokens

Analysis Summary

Z.ai: GLM 4.7 Flash sits in the Professional tier on our leaderboard, ranked #10 of 565 published models on overall intelligence. At $0.060 input and $0.400 output per 1M tokens, it is among the most expensive on the market. It offers a generous context window for extended reasoning and code review and supports tool use and function calling.

Editorial notes

Z.ai GLM 4.7 Flash delivers a strong agentic index and near-perfect tau2 reliability at $0.06/$0.40 per million tokens, making it an exceptional value option for agentic workflows despite provider adoption considerations.

Assessed May 31, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence4.1Technical5.3Value8Content5.4
Intelligence 4.1/10
Technical 5.3/10
Content 5.4/10
Value 8/10

How Z.ai: GLM 4.7 Flash compares

Z.ai: GLM 4.7 Flash ranks #105 of 370 AI models we track for overall intelligence, #102 of 307 for coding, #55 of 282 for agentic tasks. Its 203K-token context window is larger than 67% of the models we list. At $0.06 per million input tokens it is cheaper than 72% of comparable models.

About Z.ai: GLM 4.7 Flash

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,..

Capabilities

Tool Use Function Calling

Performance Indices

Source: Artificial Analysis

30.1 Intelligence Index
25.9 Coding Index
60.4 Agentic Index

Benchmark Scores

Intelligence

GPQA Diamond 58.1% Graduate-level scientific reasoning
HLE 7.1% Humanity's Last Exam
SciCode 33.7% Scientific computing

Technical

TerminalBench Hard 22% Agentic terminal tasks
τ²-Bench 98.8% Conversational agent benchmark

Content

IFBench 60.8% Instruction following
LCR 35% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does Z.ai: GLM 4.7 Flash stack up?

Compare side-by-side with other professional models.

Compare Models

Model Information

OpenRouter ID z-ai/glm-4.7-flash
Providerz-ai
Release Date January 19, 2026
Context Length202,752 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.06 $0.000060
Output $0.40 $0.000400

Live Performance

Live endpoint metrics, refreshed every 30 minutes.

83%
Avg Uptime
407ms
Best Latency (TTFT)
71 tok/s
Best Throughput
5/6
Active Endpoints
Available via: DeepInfra, Cloudflare, Z.AI, Novita, Phala, Venice

Leaderboard Categories

Frequently asked questions about Z.ai: GLM 4.7 Flash

How much does Z.ai: GLM 4.7 Flash cost?

Z.ai: GLM 4.7 Flash costs $0.06 per million input tokens and $0.40 per million output tokens.

What is the context window of Z.ai: GLM 4.7 Flash?

Z.ai: GLM 4.7 Flash has a context window of 202,752 tokens (203K).

Is Z.ai: GLM 4.7 Flash good for coding?

On our coding benchmark index, Z.ai: GLM 4.7 Flash ranks #102 of 307 models, placing it in the broader range of the field for code generation and debugging.

What can Z.ai: GLM 4.7 Flash do?

Z.ai: GLM 4.7 Flash supports tool use and function calling.

Who created Z.ai: GLM 4.7 Flash?

Z.ai: GLM 4.7 Flash is developed by Z.ai and was released on January 19, 2026.