Z.ai: GLM 4.5V

Z.ai: GLM 4.5V

z-ai · Released Aug 11, 2025 Efficient
39.6
Our Score

Performance Profile

Intelligence3Technical2.2Value7Content3
Intelligence 3/10
Technical 2.2/10
Content 3/10
Value 7/10

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding, image Q&A, OCR, and document parsing, with strong gains in front-end web coding, grounding, and spatial reasoning. It offers a hybrid inference mode: a "thinking mode" for deep reasoning and a "non-thinking mode" for fast responses. Reasoning behavior can be toggled via the reasoning enabled boolean. Learn more in our docs

$0.60 / 1M
Input Price
$1.80 / 1M
Output Price
65,536 tokens
Context Window
16,384 tokens
Max Output

Capabilities

Tool Use Function Calling Vision

Architecture

ModalityText + Image → Text
TokenizerOther

Performance Indices

Source: Artificial Analysis

12.7 Intelligence Index
10.8 Coding Index
13.2 Agentic Index
15.3 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 57.3% Graduate-level scientific reasoning
HLE 3.6% Humanity's Last Exam
MMLU Pro 75.1% Multi-task language understanding
AIME 2025 15.3% Competition mathematics (2025)
SciCode 18.8% Scientific computing

Technical

LiveCodeBench 35.2% Live coding evaluation
TerminalBench Hard 6.8% Agentic terminal tasks
τ²-Bench 19.6% Conversational agent benchmark

Content

IFBench 28.6% Instruction following

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID z-ai/glm-4.5v
Providerz-ai
Release Date August 11, 2025
Context Length65,536 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.60 $0.000600
Output $1.80 $0.001800

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

950ms
Best Latency (TTFT)
63 tok/s
Best Throughput
0/2
Active Endpoints
Available via: Novita, Z.AI