Anthropic: Claude 3.7 Sonnet

anthropic · Released Feb 24, 2025 · Specialist
Intelligence: 65.9 (Our Score), rank #67 / 523
AA Index: 30.8 (Artificial Analysis), rank #87 / 351
Input: $3.00 per 1M tokens, rank #473 / 523
Output: $15.00 per 1M tokens, rank #481 / 523
Context: 200,000 tokens, rank #144 / 523

Analysis Summary

Anthropic: Claude 3.7 Sonnet sits in the Specialist tier on our leaderboard, ranked #67 of 523 published models on overall intelligence. At $3.00 input and $15.00 output per 1M tokens, it is among the most expensive models we track, ranking #473 and #481 of 523 on input and output price. Its 200,000-token context window suits extended reasoning and code review, and it supports tool use, function calling, vision, and reasoning.

Editorial notes

Claude 3.7 Sonnet is a well-rounded model from Anthropic with solid reasoning, coding, and agentic capabilities, plus vision and a 200K context window. It's a reliable choice for businesses needing a capable all-rounder from a trusted provider, though it sits below the current generation of flagship models.

Assessed April 23, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.

Performance Profile

Intelligence 5.5/10
Technical 4.8/10
Content 6.5/10
Value 6/10

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and..

Capabilities

Tool Use · Function Calling · Vision

Performance Indices

Source: Artificial Analysis

Intelligence Index: 30.8
Coding Index: 26.7
Agentic Index: 35.6
Math Index: 21

Benchmark Scores

Intelligence

GPQA Diamond 65.6% Graduate-level scientific reasoning
HLE 4.8% Humanity's Last Exam
MMLU Pro 80.3% Multi-task language understanding
MATH 500 85% Mathematical problem-solving
AIME 22.3% Competition mathematics
AIME 2025 21% Competition mathematics (2025)
SciCode 37.6% Scientific computing

Technical

LiveCodeBench 39.4% Live coding evaluation
TerminalBench Hard 21.2% Agentic terminal tasks
τ²-Bench 50% Conversational agent benchmark

Content

IFBench 44% Instruction following
LCR 48.3% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID anthropic/claude-3.7-sonnet
Provider anthropic
Model Family Claude 3
Release Date February 24, 2025
Context Length 200,000 tokens
Max Completion 128,000 tokens
Status Active
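
The two token limits above interact when sizing a request. The sketch below is illustrative only and not part of the leaderboard: it assumes the prompt and the completion share the 200,000-token context window, so the usable completion size is capped by both the Max Completion figure and whatever room the prompt leaves. The function name max_output_budget is hypothetical, and actual limits may differ by endpoint.

```python
# Limits copied from the Model Information list above.
CONTEXT_LENGTH = 200_000   # total context window, in tokens
MAX_COMPLETION = 128_000   # largest completion the listing reports, in tokens


def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest completion size that still fits.

    Assumption (not stated on the page): prompt and completion tokens
    share the same context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens
    # Capped by the completion limit, and never negative.
    return max(0, min(MAX_COMPLETION, remaining))


if __name__ == "__main__":
    # Hypothetical prompt sizes, chosen only for illustration.
    for prompt in (50_000, 150_000, 250_000):
        print(prompt, max_output_budget(prompt))
```

Under that assumption, a 150,000-token prompt leaves room for at most 50,000 completion tokens, while a prompt of 72,000 tokens or fewer can use the full 128,000.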

Pricing

Token Type   Cost per 1M tokens   Cost per 1K tokens
Input        $3.00                $0.003000
Output       $15.00               $0.015000
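
To turn the listed rates into a per-request estimate, multiply each token count by its per-1M price and divide by one million. The sketch below is a minimal illustration using the prices on this page; the function name estimate_cost_usd and the sample token counts are hypothetical, and it ignores any discounts, caching, or provider-specific surcharges.

```python
from decimal import Decimal

# Prices copied from the Pricing table (USD per 1M tokens).
INPUT_USD_PER_MILLION = Decimal("3.00")
OUTPUT_USD_PER_MILLION = Decimal("15.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_USD_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens, 2,000 completion tokens.
    print(estimate_cost_usd(10_000, 2_000))
```

For that hypothetical request, input and output each contribute $0.03, for $0.06 in total. Decimal is used rather than float so that small per-token amounts are not distorted by binary rounding.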

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Avg Uptime: 99.1%
Best Latency (TTFT): 779ms
Best Throughput: 42 tok/s
Active Endpoints: 2/4
Available via: Amazon Bedrock, Google

Leaderboard Categories