Anthropic: Claude Opus 4


anthropic · Released May 22, 2025
Our Score: 68

Claude Opus 4 was benchmarked as the world's best coding model at the time of its release, delivering sustained performance on complex, long-running tasks and agent workflows. It set new standards in software engineering, achieving leading results on SWE-bench (72.5%) and Terminal-bench (43.2%). Opus 4 supports extended agentic workflows, handling thousands of task steps continuously for hours without degradation.

Input Price: $15.00 / 1M tokens
Output Price: $75.00 / 1M tokens
Context Window: 200,000 tokens
Max Output: 32,000 tokens

Capabilities

Tool Use · Function Calling · Vision

Architecture

Modality: Text + Image + File → Text
Tokenizer: Claude

Performance Indices

Source: Artificial Analysis

Intelligence Index: 27.4
Coding Index: 34
Agentic Index: 52.3
Math Index: 73.3

Benchmark Scores

Evaluations

GPQA Diamond 79.6%
Graduate-level scientific reasoning
HLE 11.7%
Humanity's Last Exam
MMLU Pro 87.3%
Multi-task language understanding
LiveCodeBench 63.6%
Live coding evaluation
SciCode 39.8%
Scientific computing
MATH 500 98.2%
Mathematical problem-solving
AIME 75.7%
Competition mathematics
AIME 2025 73.3%
Competition mathematics (2025)
IFBench 53.7%
Instruction following
LCR 33.7%
Long-context reasoning
TerminalBench Hard 31.1%
Agentic terminal tasks
τ²-Bench 73.4%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID: anthropic/claude-opus-4
Provider: anthropic
Release Date: May 22, 2025
Context Length: 200,000 tokens
Max Completion: 32,000 tokens
Status: Active
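
The OpenRouter ID above can be used directly against OpenRouter's OpenAI-compatible chat completions endpoint. A minimal stdlib-only sketch (the helper names and the guarded `complete` wrapper are illustrative, not an official client; set `OPENROUTER_API_KEY` in your environment before sending a real request):

```python
import json
import os
import urllib.request

# Endpoint and model ID per the listing above; OpenRouter exposes an
# OpenAI-compatible chat completions API at this URL.
API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL_ID = "anthropic/claude-opus-4"


def build_request(prompt, max_tokens=1024):
    """Assemble URL, headers, and JSON payload for a single-turn request."""
    payload = {
        "model": MODEL_ID,
        "max_tokens": max_tokens,  # this model caps completions at 32,000 tokens
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    return API_URL, headers, payload


def complete(prompt):
    """POST the request and return the assistant's reply text."""
    url, headers, payload = build_request(prompt)
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode(), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

In practice most users would reach the same endpoint through an OpenAI-compatible SDK by overriding the base URL; the payload shape is identical.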

Pricing

Token Type | Cost per 1M tokens | Cost per 1K tokens
Input      | $15.00             | $0.015
Output     | $75.00             | $0.075
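
As a quick sanity check on the table, per-request cost is just a weighted sum of the two token counts. A minimal helper (the token counts in the example are illustrative):

```python
# Rates from the pricing table above, in USD per 1M tokens.
INPUT_PER_M = 15.00
OUTPUT_PER_M = 75.00


def request_cost(input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000


# e.g. a 2,000-token prompt with an 800-token reply:
# 2,000 * 15 / 1e6 + 800 * 75 / 1e6 = 0.03 + 0.06 = 0.09 USD
cost = request_cost(2_000, 800)
```

Note that output tokens cost 5x input tokens, so long completions dominate the bill even for prompt-heavy workloads.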

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Avg Uptime: 100%
Best Latency (TTFT): 1,362 ms
Best Throughput: 4 tok/s
Active Endpoints: 1/4
Available via: Amazon Bedrock, Google, Anthropic
