Qwen3 4B 2507 Instruct

Alibaba · Released Aug 6, 2025 Professional
Intelligence #9 / 579
82.0 Our Score
AA Index #276 / 380
7.1 Artificial Analysis
Input: Not priced
Output: Not priced
Context: Not reported

Analysis Summary

Qwen3 4B 2507 Instruct is a mid-2025 refresh of Alibaba's 4B-parameter line, showing meaningful gains in math (math index 52.3) and instruction following (IFBench 33.5%) over the original Qwen3 4B release. Coding capability has improved to a coding index of 9, and the model shows some terminal and agentic awareness, though both remain well below business-grade thresholds.

For businesses, this model is best suited to structured content tasks: templated writing, form completion, basic summarisation, and lightweight classification. The improved instruction following makes it more reliable for prompt-driven workflows than its predecessor, but it cannot handle complex codebases, autonomous agents, or deep document reasoning.

No per-token pricing is listed for this variant beyond its current free availability via OpenRouter, which makes cost-benefit analysis difficult. Teams already using Qwen3 4B should consider this update for instruction-sensitive tasks, but organisations with more demanding requirements should look to larger models in the Qwen3 family or frontier alternatives.

Assessed June 17, 2026

Editorial notes

Qwen3 4B 2507 Instruct is an updated Alibaba small model with improved math and instruction following over its predecessor, but coding and agentic scores remain limited for professional workflows.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.

Performance Profile

Intelligence 2/10
Technical 1.7/10
Content 2.3/10
Value 0/10

How Qwen3 4B 2507 Instruct compares

Qwen3 4B 2507 Instruct ranks #276 of 380 AI models we track for overall intelligence, #254 of 317 for coding, and #216 of 292 for agentic tasks. It is currently free to use via OpenRouter.
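
To put those ranks on a common scale, each one can be converted into the share of tracked models that sit below it. The short sketch below does that arithmetic with the rank and total figures quoted above; the function name `share_outranked` is illustrative and not part of any leaderboard tooling.

```python
def share_outranked(rank: int, total: int) -> float:
    """Return the fraction of tracked models ranked below a 1-based rank."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) / total


# (rank, total) pairs taken from the comparison paragraph above.
rankings = {
    "intelligence": (276, 380),
    "coding": (254, 317),
    "agentic": (216, 292),
}

for category, (rank, total) in rankings.items():
    share = share_outranked(rank, total)
    print(f"{category}: outranks {share:.1%} of tracked models")
# intelligence: outranks 27.4% of tracked models
# coding: outranks 19.9% of tracked models
# agentic: outranks 26.0% of tracked models
```

Read this way, the model sits in roughly the bottom 20 to 30 percent of each tracked list, with coding its weakest relative position.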

Performance Indices

Source: Artificial Analysis

7.1 Intelligence Index
9 Coding Index
15.6 Agentic Index
52.3 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 51.7% Graduate-level scientific reasoning
HLE 4.7% Humanity's Last Exam
MMLU Pro 67.2% Multi-task language understanding
AIME 2025 52.3% Competition mathematics (2025)
SciCode 18.1% Scientific computing

Technical

LiveCodeBench 37.7% Live coding evaluation
TerminalBench Hard 4.5% Agentic terminal tasks
τ²-Bench 26.6% Conversational agent benchmark

Content

IFBench 33.5% Instruction following
LCR 7.3% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

Provider Alibaba
Release Date August 6, 2025
Status Active

Frequently asked questions about Qwen3 4B 2507 Instruct

How much does Qwen3 4B 2507 Instruct cost?

Qwen3 4B 2507 Instruct is currently available for free via OpenRouter.

Is Qwen3 4B 2507 Instruct good for coding?

On our coding benchmark index, Qwen3 4B 2507 Instruct ranks #254 of 317 models, placing it in the bottom quarter of the field for code generation and debugging.

Who created Qwen3 4B 2507 Instruct?

Qwen3 4B 2507 Instruct is developed by Alibaba and was released on August 6, 2025.