Qwen3.5 4B (Reasoning)

Alibaba · Released Mar 2, 2026 Professional
Intelligence: #10 / 576 (Our Score: 82.0)
Speed: #34 / 271 (203.6 tokens / sec)
Input: #139 / 577 ($0.030 per 1M tokens)
Output: #162 / 577 ($0.150 per 1M tokens)
Context: Not reported

Analysis Summary

Qwen3.5 4B (Reasoning) is Alibaba's small reasoning-capable model, and its benchmark profile punches above its size class in several areas. An agentic index of 55.1, GPQA of 0.771, and long-context reliability of 0.557 are all strong for a 4B-parameter model, and the reasoning mode delivers meaningful uplift on structured tasks. Its coding index of 17.5 is a limitation for technical workloads.

For businesses, this model suits cost-sensitive pipelines requiring reasoning capability: structured content generation, instruction-following tasks, and lightweight agentic coordination. At $0.03 input and $0.15 output per million tokens, it offers exceptional cost efficiency for the reasoning capability it provides. It is not suited to heavy coding or complex multi-step tool use.

A -4 point regional penalty applies given the provider's limited enterprise footprint. Teams running high-volume reasoning pipelines on tight budgets will find this one of the most cost-efficient options available, provided the coding limitation is not a constraint.

Assessed June 6, 2026

Editorial notes

Qwen3.5 4B (Reasoning) from Alibaba offers a strong agentic index of 55.1 and GPQA of 0.771 at just $0.03/$0.15 per million tokens, with limited coding depth.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.

Performance Profile

Intelligence 4/10
Technical 4.1/10
Content 6.2/10
Value 7.3/10

How Qwen3.5 4B (Reasoning) compares

Of the AI models we track, Qwen3.5 4B (Reasoning) ranks #126 of 378 for overall intelligence, #161 of 315 for coding, and #68 of 289 for agentic tasks. At $0.03 per million input tokens, it is cheaper than 76% of comparable models.
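
To make the rank figures easier to compare across categories of different sizes, each rank can be converted into the share of tracked models it sits ahead of. The sketch below is illustrative only: the helper name `share_outranked` is hypothetical, and it assumes rank 1 is best and that ranks contain no ties.

```python
def share_outranked(rank: int, total: int) -> float:
    """Fraction of tracked models that sit below the given rank (rank 1 is best)."""
    return (total - rank) / total


# Rank and field size for each category, as quoted on this page.
rankings = {
    "intelligence": (126, 378),
    "coding": (161, 315),
    "agentic": (68, 289),
}

for category, (rank, total) in rankings.items():
    share = share_outranked(rank, total)
    print(f"{category}: ahead of {share:.0%} of tracked models")
```

By this measure the model is ahead of about two thirds of the field for intelligence, slightly under half for coding, and roughly three quarters for agentic tasks.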

Performance Indices

Source: Artificial Analysis

27.1 Intelligence Index
17.5 Coding Index
55.1 Agentic Index

Benchmark Scores

Intelligence

GPQA Diamond 77.1% Graduate-level scientific reasoning
HLE 7.8% Humanity's Last Exam
SciCode 16.1% Scientific computing

Technical

TerminalBench Hard 18.2% Agentic terminal tasks
τ²-Bench 92.1% Conversational agent benchmark

Content

IFBench 52% Instruction following
LCR 55.7% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

Provider Alibaba
Release Date March 2, 2026
Status Active

Pricing

Token Type    Cost per 1M tokens    Cost per 1K tokens
Input         $0.03                 $0.000030
Output        $0.15                 $0.000150
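
As a rough illustration of how these rates translate into a bill, the sketch below estimates the cost of a batch workload. The per-million prices come from the table above; the function name `estimate_cost_usd` and the workload figures (10,000 requests at 2,000 input and 1,500 output tokens each) are hypothetical. It also assumes reasoning tokens are billed at the output rate, which should be checked against the provider's billing terms.

```python
# Prices per 1M tokens, taken from the pricing table on this page.
INPUT_USD_PER_1M = 0.03
OUTPUT_USD_PER_1M = 0.15


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated spend in USD for the given token counts."""
    return (
        input_tokens * INPUT_USD_PER_1M + output_tokens * OUTPUT_USD_PER_1M
    ) / 1_000_000


# Hypothetical workload: 10,000 requests, each with 2,000 input tokens
# and 1,500 output tokens (reasoning tokens assumed to count as output).
requests = 10_000
total = estimate_cost_usd(requests * 2_000, requests * 1_500)
print(f"Estimated cost: ${total:.2f}")  # prints "Estimated cost: $2.85"
```

Because output tokens cost five times as much as input tokens, output length (including any reasoning trace) tends to dominate the total for this kind of workload.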

Frequently asked questions about Qwen3.5 4B (Reasoning)

How much does Qwen3.5 4B (Reasoning) cost?

Qwen3.5 4B (Reasoning) costs $0.03 per million input tokens and $0.15 per million output tokens.

Is Qwen3.5 4B (Reasoning) good for coding?

On our coding benchmark index, Qwen3.5 4B (Reasoning) ranks #161 of 315 models, placing it near the middle of the field, just below the median, for code generation and debugging. Its coding index of 17.5 makes it a weak fit for heavy coding workloads.

Who created Qwen3.5 4B (Reasoning)?

Qwen3.5 4B (Reasoning) is developed by Alibaba and was released on March 2, 2026.