Qwen2.5 72B Instruct

Qwen2.5 72B Instruct

qwen · Released Sep 19, 2024 Efficient
Intelligence #199 / 557
41.3 Our Score
Speed #190 / 259
54.9 tokens / sec
Input #345 / 560
$0.360 per 1M tokens
Output #219 / 560
$0.400 per 1M tokens
Context #222 / 560
131,072 tokens

Analysis Summary

Qwen2.5 72B Instruct sits in the Efficient tier on our leaderboard, ranked #199 of 557 published models on overall intelligence. At $0.360 input and $0.400 output per 1M tokens, it is among the most expensive on the market. It offers a standard large context window and supports tool use and function calling.

Editorial notes

Qwen2.5 72B Instruct offers moderate general knowledge at low cost with tool use and function calling, but benchmark scores are limited and it has been superseded by newer Qwen generations.

Assessed May 17, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence3.3Technical2.5Value7.8Content3
Intelligence 3.3/10
Technical 2.5/10
Content 3/10
Value 7.8/10

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and..

72B Parameters

Capabilities

Tool Use Function Calling

Architecture Detail

Instruct Typechatml

Performance Indices

Source: Artificial Analysis

15.6 Intelligence Index
11.9 Coding Index
19.5 Agentic Index
14 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 49.1% Graduate-level scientific reasoning
HLE 4.2% Humanity's Last Exam
MMLU Pro 72% Multi-task language understanding
MATH 500 85.8% Mathematical problem-solving
AIME 16% Competition mathematics
AIME 2025 14% Competition mathematics (2025)
SciCode 26.7% Scientific computing

Technical

LiveCodeBench 27.6% Live coding evaluation
TerminalBench Hard 4.5% Agentic terminal tasks
τ²-Bench 34.5% Conversational agent benchmark

Content

IFBench 36.9% Instruction following
LCR 20.3% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does Qwen2.5 72B Instruct stack up?

Compare side-by-side with other efficient models.

Compare Models

Model Information

OpenRouter ID qwen/qwen-2.5-72b-instruct
Providerqwen
Release Date September 19, 2024
Context Length131,072 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.36 $0.000360
Output $0.40 $0.000400

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

100%
Avg Uptime
1,055ms
Best Latency (TTFT)
21 tok/s
Best Throughput
2/2
Active Endpoints
Available via: DeepInfra, Novita

Leaderboard Categories