Qwen: Qwen3 8B

Qwen: Qwen3 8B

qwen · Released Apr 28, 2025
30
Our Score

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math, coding, and logical inference, and "non-thinking" mode for general conversation. The model is fine-tuned for instruction-following, agent integration, creative writing, and multilingual use across 100+ languages and dialects. It natively supports a 32K token context window and can extend to 131K tokens with YaRN scaling.

$0.05 / 1M Input Price
$0.40 / 1M Output Price
40,960 tokens Context Window
8,192 tokens Max Output
8B Parameters

Capabilities

Tool Use Function Calling

Architecture

ModalityText → Text
TokenizerQwen3
Instruct Typeqwen3
Parameters8B

Performance Indices

Source: Artificial Analysis

10.6 Intelligence Index
7.1 Coding Index
13.6 Agentic Index
24.3 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 45.2%
Graduate-level scientific reasoning
HLE 2.8%
Humanity's Last Exam
MMLU Pro 64.3%
Multi-task language understanding
LiveCodeBench 20.2%
Live coding evaluation
SciCode 16.8%
Scientific computing
MATH 500 82.8%
Mathematical problem-solving
AIME 24.3%
Competition mathematics
AIME 2025 24.3%
Competition mathematics (2025)
IFBench 28.6%
Instruction following
TerminalBench Hard 2.3%
Agentic terminal tasks
τ²-Bench 24.9%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID qwen/qwen3-8b
Providerqwen
Release Date April 28, 2025
Context Length40,960 tokens
Max Completion8,192 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.05 $0.000050
Output $0.40 $0.000400

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

100%
Avg Uptime
418ms
Best Latency (TTFT)
86 tok/s
Best Throughput
2/2
Active Endpoints
Available via: AtlasCloud, Alibaba