Qwen2.5 72B Instruct

Qwen2.5 72B Instruct

qwen · Released Sep 19, 2024
50
Our Score

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and mathematics, thanks to our specialized expert models in these domains. - Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g, tables), and generating structured outputs especially JSON. More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots. - Long-context Support up to 128K tokens and can generate up to 8K tokens. - Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. Usage of this model is subject to Tongyi Qianwen LICENSE AGREEMENT.

$0.12 / 1M Input Price
$0.39 / 1M Output Price
32,768 tokens Context Window
16,384 tokens Max Output
72B Parameters

Capabilities

Tool Use Function Calling

Architecture

ModalityText → Text
TokenizerQwen
Instruct Typechatml
Parameters72B

Performance Indices

Source: Artificial Analysis

15.6 Intelligence Index
11.9 Coding Index
19.5 Agentic Index
14 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 49.1%
Graduate-level scientific reasoning
HLE 4.2%
Humanity's Last Exam
MMLU Pro 72%
Multi-task language understanding
LiveCodeBench 27.6%
Live coding evaluation
SciCode 26.7%
Scientific computing
MATH 500 85.8%
Mathematical problem-solving
AIME 16%
Competition mathematics
AIME 2025 14%
Competition mathematics (2025)
IFBench 36.9%
Instruction following
LCR 20.3%
Long-context reasoning
TerminalBench Hard 4.5%
Agentic terminal tasks
τ²-Bench 34.5%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID qwen/qwen-2.5-72b-instruct
Providerqwen
Release Date September 19, 2024
Context Length32,768 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.12 $0.000120
Output $0.39 $0.000390

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

99.9%
Avg Uptime
705ms
Best Latency (TTFT)
38 tok/s
Best Throughput
2/2
Active Endpoints
Available via: DeepInfra, Novita

Leaderboard Categories