Qwen3 VL 4B (Reasoning)

Alibaba · Released Oct 14, 2025 · Emerging
Intelligence #468 / 556
18.3 Our Score
AA Index #244 / 365
13.7 Artificial Analysis
Input Not priced
Output Not priced
Context Not reported

Analysis Summary

Qwen3 VL 4B (Reasoning) sits in the Emerging tier on our leaderboard, ranked #468 of 556 published models on overall intelligence. Input and output pricing are not reported, so it cannot be compared with other models on cost.

Editorial notes

Qwen3 VL 4B (Reasoning) adds vision and reasoning modes to a compact base but scores poorly on agentic and terminal benchmarks (15.5% on τ²-Bench, 1.5% on TerminalBench Hard), limiting its utility for structured business tasks.

Assessed May 14, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability, and are refreshed as new models launch.

Performance Profile

Intelligence 3/10
Technical 1.3/10
Content 3.5/10
Value 0/10

Performance Indices

Source: Artificial Analysis

13.7 Intelligence Index
6.7 Coding Index
8.5 Agentic Index
25.7 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 49.4% Graduate-level scientific reasoning
HLE 4.4% Humanity's Last Exam
MMLU Pro 70% Multi-task language understanding
AIME 2025 25.7% Competition mathematics (2025)
SciCode 17.1% Scientific computing

Technical

LiveCodeBench 32% Live coding evaluation
TerminalBench Hard 1.5% Agentic terminal tasks
τ²-Bench 15.5% Conversational agent benchmark

Content

IFBench 36.6% Instruction following
LCR 21.3% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

Provider Alibaba
Release Date October 14, 2025
Status Active