Z.ai: GLM 4.5 Air

Z.ai: GLM 4.5 Air

z-ai · Released Jul 25, 2025
58
Our Score

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs

$0.13 / 1M Input Price
$0.85 / 1M Output Price
131,072 tokens Context Window
98,304 tokens Max Output

Capabilities

Tool Use Function Calling

Architecture

ModalityText → Text
TokenizerOther

Performance Indices

Source: Artificial Analysis

23.2 Intelligence Index
23.8 Coding Index
33.5 Agentic Index
80.7 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 73.3%
Graduate-level scientific reasoning
HLE 6.8%
Humanity's Last Exam
MMLU Pro 81.5%
Multi-task language understanding
LiveCodeBench 68.4%
Live coding evaluation
SciCode 30.6%
Scientific computing
MATH 500 96.5%
Mathematical problem-solving
AIME 67.3%
Competition mathematics
AIME 2025 80.7%
Competition mathematics (2025)
IFBench 37.6%
Instruction following
LCR 43.7%
Long-context reasoning
TerminalBench Hard 20.5%
Agentic terminal tasks
τ²-Bench 46.5%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID z-ai/glm-4.5-air
Providerz-ai
Release Date July 25, 2025
Context Length131,072 tokens
Max Completion98,304 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.13 $0.000130
Output $0.85 $0.000850

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

99.8%
Avg Uptime
338ms
Best Latency (TTFT)
71 tok/s
Best Throughput
2/4
Active Endpoints
Available via: Novita, SiliconFlow, Z.AI, Nebius

Leaderboard Categories