Anthropic: Claude Sonnet 4

Anthropic: Claude Sonnet 4

anthropic · Released May 22, 2025
75
Our Score

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%), Sonnet 4 balances capability and computational efficiency, making it suitable for a broad range of applications from routine coding tasks to complex software development projects. Key enhancements include improved autonomous codebase navigation, reduced error rates in agent-driven workflows, and increased reliability in following intricate instructions. Sonnet 4 is optimized for practical everyday use, providing advanced reasoning capabilities while maintaining efficiency and responsiveness in diverse internal and external scenarios.

$3.00 / 1M Input Price
$15.00 / 1M Output Price
200,000 tokens Context Window
64,000 tokens Max Output

Capabilities

Tool Use Function Calling Vision

Architecture

ModalityText + Image + File → Text
TokenizerClaude

Performance Indices

Source: Artificial Analysis

33 Intelligence Index
30.6 Coding Index
39.8 Agentic Index
38 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 68.3%
Graduate-level scientific reasoning
HLE 4%
Humanity's Last Exam
MMLU Pro 83.7%
Multi-task language understanding
LiveCodeBench 44.9%
Live coding evaluation
SciCode 37.3%
Scientific computing
MATH 500 93.4%
Mathematical problem-solving
AIME 40.7%
Competition mathematics
AIME 2025 38%
Competition mathematics (2025)
IFBench 45.4%
Instruction following
LCR 44.3%
Long-context reasoning
TerminalBench Hard 27.3%
Agentic terminal tasks
τ²-Bench 52.3%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID anthropic/claude-sonnet-4
Provideranthropic
Release Date May 22, 2025
Context Length200,000 tokens
Max Completion64,000 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $3.00 $0.003000
Output $15.00 $0.015000

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

99.7%
Avg Uptime
749ms
Best Latency (TTFT)
43 tok/s
Best Throughput
3/5
Active Endpoints
Available via: Amazon Bedrock, Google, Anthropic

Leaderboard Categories