Qwen: Qwen3 VL 8B Thinking

Qwen: Qwen3 VL 8B Thinking

qwen · Released Oct 14, 2025
38
Our Score

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and long-context processing (native 256K, expandable to 1M tokens) for tasks such as scientific visual analysis, causal inference, and mathematical reasoning over image or video inputs. Compared to the Instruct edition, the Thinking version introduces deeper visual-language fusion and deliberate reasoning pathways that improve performance on long-chain logic tasks, STEM problem-solving, and multi-step video understanding. It achieves stronger temporal grounding via Interleaved-MRoPE and timestamp-aware embeddings, while maintaining robust OCR, multilingual comprehension, and text generation on par with large text-only LLMs.

$0.12 / 1M Input Price
$1.37 / 1M Output Price
131,072 tokens Context Window
32,768 tokens Max Output
8B Parameters

Capabilities

Tool Use Function Calling Vision

Architecture

ModalityText + Image → Text
TokenizerQwen3
Parameters8B

Performance Indices

Source: Artificial Analysis

16.7 Intelligence Index
9.8 Coding Index
13.2 Agentic Index
30.7 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 57.9%
Graduate-level scientific reasoning
HLE 3.3%
Humanity's Last Exam
MMLU Pro 74.9%
Multi-task language understanding
LiveCodeBench 35.3%
Live coding evaluation
SciCode 21.9%
Scientific computing
AIME 2025 30.7%
Competition mathematics (2025)
IFBench 39.9%
Instruction following
LCR 31%
Long-context reasoning
TerminalBench Hard 3.8%
Agentic terminal tasks
τ²-Bench 22.5%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID qwen/qwen3-vl-8b-thinking
Providerqwen
Release Date October 14, 2025
Context Length131,072 tokens
Max Completion32,768 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.12 $0.000117
Output $1.37 $0.001365

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

1,005ms
Best Latency (TTFT)
89.5 tok/s
Best Throughput
0/1
Active Endpoints
Available via: Alibaba