Qwen: Qwen3 8B

Qwen: Qwen3 8B

qwen · Released Apr 28, 2025 Efficient
35.1
Our Score

Performance Profile

Intelligence2.5Technical1.7Value7.5Content2.5
Intelligence 2.5/10
Technical 1.7/10
Content 2.5/10
Value 7.5/10

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math, coding, and logical inference, and "non-thinking" mode for general conversation. The model is fine-tuned for instruction-following, agent integration, creative writing, and multilingual use across 100+ languages and dialects. It natively supports a 32K token context window and can extend to 131K tokens with YaRN scaling.

$0.05 / 1M
Input Price
$0.40 / 1M
Output Price
40,960 tokens
Context Window
8,192 tokens
Max Output
8B Parameters

Capabilities

Tool Use Function Calling

Architecture

ModalityText → Text
TokenizerQwen3
Instruct Typeqwen3
Parameters8B

Performance Indices

Source: Artificial Analysis

10.6 Intelligence Index
7.1 Coding Index
13.6 Agentic Index
24.3 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 45.2% Graduate-level scientific reasoning
HLE 2.8% Humanity's Last Exam
MMLU Pro 64.3% Multi-task language understanding
MATH 500 82.8% Mathematical problem-solving
AIME 24.3% Competition mathematics
AIME 2025 24.3% Competition mathematics (2025)
SciCode 16.8% Scientific computing

Technical

LiveCodeBench 20.2% Live coding evaluation
TerminalBench Hard 2.3% Agentic terminal tasks
τ²-Bench 24.9% Conversational agent benchmark

Content

IFBench 28.6% Instruction following

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID qwen/qwen3-8b
Providerqwen
Release Date April 28, 2025
Context Length40,960 tokens
Max Completion8,192 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.05 $0.000050
Output $0.40 $0.000400

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

99%
Avg Uptime
362ms
Best Latency (TTFT)
79 tok/s
Best Throughput
2/2
Active Endpoints
Available via: AtlasCloud, Alibaba