Qwen3 4B (Reasoning)

Alibaba · Released Apr 28, 2025 · Professional
Intelligence #10 / 576
82.0 Our Score
Speed #121 / 271
103.9 tokens / sec
Input #223 / 577
$0.110 per 1M tokens
Output #350 / 577
$1.26 per 1M tokens
Context
— Not reported

Analysis Summary

Qwen3 4B (Reasoning) is Alibaba's compact thinking-mode model, released in April 2025. With a GPQA Diamond score of 52.2% and a LiveCodeBench score of 46.5%, it punches above its weight for a 4-billion-parameter model. Its Agentic Index of 19 places it #185 of 289 models tracked for agentic tasks, and its instruction-following score (IFBench: 32.5%) is adequate for structured tasks.

For businesses, it suits cost-sensitive workflows: SEO content drafting, structured data extraction, and lightweight reasoning tasks where a larger model would be overkill. Its coding results are mixed (LiveCodeBench 46.5%, SciCode 3.5%) and its context window is not reported, so it should not be relied upon for complex codebases or long-document analysis.

At $0.11 input and $1.26 output per million tokens, it is inexpensive for a reasoning-capable model, with an input price lower than 61% of comparable models. Teams running high-volume, lower-complexity tasks will find strong price-performance here, with the caveat that Alibaba's API accessibility may require additional integration work for enterprise environments.

Assessed June 6, 2026

Editorial notes

Qwen3 4B (Reasoning) from Alibaba offers strong GPQA Diamond and LiveCodeBench scores for a 4B model at very low cost, making it a practical choice for structured content and reasoning tasks on a tight budget.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.

Performance Profile

Intelligence 2.7/10
Technical 2.8/10
Content 3.6/10
Value 7/10

How Qwen3 4B (Reasoning) compares

Qwen3 4B (Reasoning) ranks #251 of 378 AI models we track for overall intelligence and #185 of 289 for agentic tasks. At $0.11 per million input tokens, it is cheaper than 61% of comparable models.
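
To put the rank figures above on a common scale, a rank can be converted into the share of tracked models that sit below it. The sketch below is illustrative only: the helper name `share_ranked_below` is hypothetical, and it assumes ranks are 1-based with no ties, which the page does not state.

```python
def share_ranked_below(rank: int, total: int) -> float:
    """Fraction of tracked models ranked below the given 1-based rank.

    Assumes no ties; the listing does not say how ties are handled.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) / total


# Ranks quoted in the comparison text above.
intelligence_share = share_ranked_below(251, 378)
agentic_share = share_ranked_below(185, 289)

print(f"Intelligence: ahead of {intelligence_share:.1%} of tracked models")
print(f"Agentic: ahead of {agentic_share:.1%} of tracked models")
```

Under these assumptions, the model sits ahead of roughly a third of tracked models on both measures.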

Performance Indices

Source: Artificial Analysis

14.2 Intelligence Index
19 Agentic Index
22.3 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 52.2% Graduate-level scientific reasoning
HLE 5.1% Humanity's Last Exam
MMLU Pro 69.6% Multi-task language understanding
MATH 500 93.3% Mathematical problem-solving
AIME 65.7% Competition mathematics
AIME 2025 22.3% Competition mathematics (2025)
SciCode 3.5% Scientific computing

Technical

LiveCodeBench 46.5% Live coding evaluation
τ²-Bench 19% Conversational agent benchmark

Content

IFBench 32.5% Instruction following

Benchmark data from Artificial Analysis and Hugging Face

Model Information

Provider Alibaba
Release Date April 28, 2025
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.11 $0.000110
Output $1.26 $0.001260
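
As a rough illustration of how the per-token prices above translate into a bill, the sketch below estimates the cost of a batch of requests. The workload figures (10,000 requests at 800 input and 1,500 output tokens each) are hypothetical. It also assumes that reasoning ("thinking") tokens are billed at the output rate, which this page does not confirm.

```python
# Prices copied from the pricing table above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.11
OUTPUT_PRICE_PER_M = 1.26


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for the given token counts.

    Assumption: reasoning tokens count toward output_tokens.
    """
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Hypothetical workload, not taken from the listing.
requests = 10_000
total = estimate_cost(requests * 800, requests * 1_500)
print(f"Estimated batch cost: ${total:.2f}")
```

Because the output price is more than ten times the input price, long reasoning traces are likely to dominate the total, so output length is the figure to watch when budgeting.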

Frequently asked questions about Qwen3 4B (Reasoning)

How much does Qwen3 4B (Reasoning) cost?

Qwen3 4B (Reasoning) costs $0.11 per million input tokens and $1.26 per million output tokens.

Who created Qwen3 4B (Reasoning)?

Qwen3 4B (Reasoning) is developed by Alibaba and was released on April 28, 2025.