Qwen3 4B 2507 (Reasoning)

Name: Qwen3 4B 2507 (Reasoning) Review
Item: Qwen3 4B 2507 (Reasoning)
Rating: 2.4
Author: Design for Online

Alibaba · Released Aug 6, 2025 · Emerging tier

Intelligence rank: #385 / 561 (Our Score: 24.3)
AA Index rank: #183 / 368 (Artificial Analysis: 18.2)
Input price: Not priced
Output price: Not priced
Context: Not reported

Qwen3 4B 2507 (Reasoning) sits in the Emerging tier on our leaderboard, ranked #385 of 561 published models on overall intelligence. Input and output prices per 1M tokens are not reported, so the model cannot be placed on a cost comparison.
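
To put the two rankings quoted on this page in perspective, the short sketch below converts a rank into the share of listed models placed below it. The helper name `share_ranked_below` is hypothetical and used only for illustration; the rank figures (#385 of 561 and #183 of 368) are the ones reported here, and the calculation assumes rank 1 is the best position.

```python
def share_ranked_below(rank: int, total: int) -> float:
    """Return the fraction of listed models placed below the given rank (1 = best)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) / total


# Rank figures as quoted on this page.
overall = share_ranked_below(385, 561)   # overall intelligence leaderboard
aa_index = share_ranked_below(183, 368)  # Artificial Analysis index

print(f"Overall intelligence: ahead of {overall:.1%} of listed models")
print(f"AA Index: ahead of {aa_index:.1%} of listed models")
```

Under these assumptions the model places ahead of roughly 31.4% of models on the overall leaderboard and about 50.3% on the Artificial Analysis index, which is consistent with a mid-to-lower standing overall.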

Editorial notes

Qwen3 4B 2507 (Reasoning) is a compact model with exceptional math scores (82.7% on AIME 2025) and strong LiveCodeBench performance (64.1%) for its size, though overall intelligence and agentic capability remain limited.

Assessed May 14, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.

Performance Profile

Performance Indices

Source: Artificial Analysis

Intelligence Index	18.2
Coding Index	9.5
Agentic Index	13.5
Math Index	82.7

Benchmark Scores

GPQA Diamond	66.7%	Graduate-level scientific reasoning
HLE	5.9%	Humanity's Last Exam
MMLU Pro	74.3%	Multi-task language understanding
AIME 2025	82.7%	Competition mathematics (2025)
SciCode	25.6%	Scientific computing
LiveCodeBench	64.1%	Live coding evaluation
TerminalBench Hard	1.5%	Agentic terminal tasks
τ²-Bench	25.4%	Conversational agent benchmark
IFBench	49.8%	Instruction following
LCR	37.7%	Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

Provider	Alibaba
Release Date	August 6, 2025
Status	Active
