Anthropic: Claude 3.7 Sonnet (thinking)

anthropic · Released Feb 24, 2025
Our Score: 70

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes. Claude 3.7 Sonnet maintains performance parity with its predecessor in standard mode while offering an extended reasoning mode for enhanced accuracy in math, coding, and instruction-following tasks.

$3.00 / 1M Input Price
$15.00 / 1M Output Price
200,000 tokens Context Window
64,000 tokens Max Output

Capabilities

Tool Use, Function Calling, Vision

Architecture

Modality: Text + Image + File → Text
Tokenizer: Claude

Performance Indices

Source: Artificial Analysis

34.7 Intelligence Index
27.6 Coding Index
38 Agentic Index
56.3 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 77.2%
Graduate-level scientific reasoning
HLE 10.3%
Humanity's Last Exam
MMLU Pro 83.7%
Multi-task language understanding
LiveCodeBench 47.3%
Live coding evaluation
SciCode 40.3%
Scientific computing
MATH 500 94.7%
Mathematical problem-solving
AIME 48.7%
Competition mathematics
AIME 2025 56.3%
Competition mathematics (2025)
IFBench 48.3%
Instruction following
LCR 60.7%
Long-context reasoning
TerminalBench Hard 21.2%
Agentic terminal tasks
τ²-Bench 54.7%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID: anthropic/claude-3.7-sonnet:thinking
Provider: anthropic
Model Family: Claude 3
Release Date: February 24, 2025
Context Length: 200,000 tokens
Max Completion: 64,000 tokens
Status: Active

Pricing

Token Type    Cost per 1M tokens    Cost per 1K tokens
Input         $3.00                 $0.003000
Output        $15.00                $0.015000
Live Performance

Live endpoint metrics — refreshed every 30 minutes.

Avg Uptime: 100%
Best Latency (TTFT): 1,955ms
Best Throughput: 43 tok/s
Active Endpoints: 1/1
Available via: Google
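
The latency and throughput figures above can be combined into a back-of-the-envelope estimate of how long a response takes to stream. This is only a sketch with hypothetical names: it assumes generation proceeds at the listed best throughput once the first token arrives, so the result is an optimistic lower bound and not a measured value.

```python
# Figures taken from the live metrics above.
BEST_TTFT_SECONDS = 1.955          # 1,955 ms time to first token
BEST_THROUGHPUT_TOK_PER_S = 43.0   # tokens per second


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock seconds to receive output_tokens tokens.

    Assumption: constant throughput after the first token; real
    endpoints vary with load, so treat this as a lower bound.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return BEST_TTFT_SECONDS + output_tokens / BEST_THROUGHPUT_TOK_PER_S


if __name__ == "__main__":
    # A 430-token reply: 1.955 s to first token plus 10 s of streaming.
    print(round(estimate_response_seconds(430), 3))
```

Under these assumptions, a full 64,000-token completion would take roughly 25 minutes to stream, which is worth keeping in mind when setting client timeouts.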

Leaderboard Categories