Z.ai: GLM 4.5V

z-ai · Released Aug 11, 2025
Our Score: 30

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B total parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding, image Q&A, OCR, and document parsing, with strong gains in front-end web coding, grounding, and spatial reasoning. It offers a hybrid inference mode: a "thinking mode" for deep reasoning and a "non-thinking mode" for fast responses. Reasoning behavior can be toggled via the `reasoning` `enabled` boolean.
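As a minimal sketch of the thinking/non-thinking toggle described above: the payload below targets an OpenAI-compatible chat-completions endpoint, and the exact shape of the `reasoning` field is an assumption based on the card's "reasoning enabled boolean" wording, not a confirmed schema.

```python
# Sketch: building a request that toggles GLM-4.5V's hybrid inference mode.
# The "reasoning" field shape is an assumption; check provider docs before use.
import json


def build_request(prompt: str, thinking: bool) -> dict:
    """Build a chat-completions payload for z-ai/glm-4.5v."""
    return {
        "model": "z-ai/glm-4.5v",
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical toggle: True -> "thinking mode" (deep reasoning),
        # False -> "non-thinking mode" (fast responses).
        "reasoning": {"enabled": thinking},
    }


fast = build_request("Describe this chart in one sentence.", thinking=False)
deep = build_request("Walk through the chart's trend step by step.", thinking=True)
print(json.dumps(fast, indent=2))
```

Keeping the toggle in one helper makes it easy to A/B the two modes against the same prompt.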

Input Price: $0.60 / 1M tokens
Output Price: $1.80 / 1M tokens
Context Window: 65,536 tokens
Max Output: 16,384 tokens

Capabilities

Tool Use · Function Calling · Vision

Architecture

Modality: Text + Image → Text
Tokenizer: Other
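The Text + Image → Text modality above can be sketched as a single multimodal user message. This uses the OpenAI-style content-parts format commonly accepted by OpenRouter providers; the exact shape supported for this model is an assumption, and the URL is a placeholder.

```python
# Sketch: pairing an image with a text question in one user message
# (OpenAI-style content parts; assumed, not provider-confirmed).
def image_question(image_url: str, question: str) -> list:
    """Return a messages list with one image + text user turn."""
    return [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": question},
        ],
    }]


messages = image_question("https://example.com/invoice.png",
                          "What is the invoice total?")
```

The same structure covers the card's OCR and document-parsing use cases; only the question text changes.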

Performance Indices

Source: Artificial Analysis

12.7 Intelligence Index
10.8 Coding Index
13.2 Agentic Index
15.3 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 57.3%
Graduate-level scientific reasoning
HLE 3.6%
Humanity's Last Exam
MMLU Pro 75.1%
Multi-task language understanding
LiveCodeBench 35.2%
Live coding evaluation
SciCode 18.8%
Scientific computing
AIME 2025 15.3%
Competition mathematics (2025)
IFBench 28.6%
Instruction following
TerminalBench Hard 6.8%
Agentic terminal tasks
τ²-Bench 19.6%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID: z-ai/glm-4.5v
Provider: z-ai
Release Date: August 11, 2025
Context Length: 65,536 tokens
Max Completion: 16,384 tokens
Status: Active

Pricing

| Token Type | Cost per 1M tokens | Cost per 1K tokens |
|------------|--------------------|--------------------|
| Input      | $0.60              | $0.000600          |
| Output     | $1.80              | $0.001800          |

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Best Latency (TTFT): 1,252ms
Best Throughput: 50 tok/s
Active Endpoints: 0/2
Available via: Novita, Z.AI