Z.ai: GLM 4.5

z-ai · Released Jul 25, 2025
Our Score: 65

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It uses a Mixture-of-Experts (MoE) architecture, supports a context length of up to 128k tokens, and delivers significantly enhanced capabilities in reasoning, code generation, and agent alignment. It offers a hybrid inference mode with two options: a "thinking mode" designed for complex reasoning and tool use, and a "non-thinking mode" optimized for instant responses. Users can switch between them with the reasoning "enabled" boolean.
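As a minimal sketch of the toggle described above, the snippet below builds an OpenRouter-style chat-completions request body for this model. The exact shape of the reasoning field is an assumption based on OpenRouter's `reasoning` parameter and may differ by provider; no network call is made here.

```python
# Hypothetical sketch: switching GLM-4.5 between thinking and non-thinking
# mode via an OpenRouter-style request payload. The "reasoning" field shape
# is an assumption, not a confirmed provider contract.

def build_payload(prompt: str, thinking: bool) -> dict:
    """Build a chat-completions request body for z-ai/glm-4.5."""
    return {
        "model": "z-ai/glm-4.5",
        "messages": [{"role": "user", "content": prompt}],
        # thinking mode: complex reasoning and tool use;
        # non-thinking mode: instant responses
        "reasoning": {"enabled": thinking},
    }

payload = build_payload("Plan a multi-step refactor.", thinking=True)
print(payload["reasoning"])  # → {'enabled': True}
```

The same payload would be POSTed to the provider's chat-completions endpoint with an API key; only the `reasoning` object changes between the two modes.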

Input Price: $0.60 / 1M tokens
Output Price: $2.20 / 1M tokens
Context Window: 131,072 tokens
Max Output: 98,304 tokens

Capabilities

Tool Use · Function Calling

Architecture

Modality: Text → Text
Tokenizer: Other

Performance Indices

Source: Artificial Analysis

26.4 Intelligence Index
26.3 Coding Index
32.5 Agentic Index
73.7 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 78.2%
Graduate-level scientific reasoning
HLE 12.2%
Humanity's Last Exam
MMLU Pro 83.5%
Multi-task language understanding
LiveCodeBench 73.8%
Live coding evaluation
SciCode 34.8%
Scientific computing
MATH 500 97.9%
Mathematical problem-solving
AIME 87.3%
Competition mathematics
AIME 2025 73.7%
Competition mathematics (2025)
IFBench 44.1%
Instruction following
LCR 48.3%
Long-context reasoning
TerminalBench Hard 22%
Agentic terminal tasks
τ²-Bench 43%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID: z-ai/glm-4.5
Provider: z-ai
Release Date: July 25, 2025
Context Length: 131,072 tokens
Max Completion: 98,304 tokens
Status: Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.60 $0.000600
Output $2.20 $0.002200
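The listed rates can be turned into a per-request cost estimate. A minimal sketch, using only the prices from the table above (the function name and token counts are illustrative):

```python
INPUT_PRICE_PER_M = 0.60   # USD per 1M input tokens (from the table)
OUTPUT_PRICE_PER_M = 2.20  # USD per 1M output tokens (from the table)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one GLM-4.5 request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 10,000-token prompt with a 2,000-token completion:
print(round(request_cost(10_000, 2_000), 4))  # → 0.0104
```

Because output tokens cost roughly 3.7× input tokens, long completions (and the model's large 98,304-token max output) dominate the bill for generation-heavy workloads.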

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

97.5%
Avg Uptime
1,526ms
Best Latency (TTFT)
32 tok/s
Best Throughput
1/3
Active Endpoints
Available via: Novita, Z.AI, Nebius
