Anthropic: Claude 3.7 Sonnet (thinking)

Anthropic · Released Feb 24, 2025 · Professional
Intelligence: #10 / 576 (Our Score: 82.0)
AA Index: #80 / 377 (Artificial Analysis: 34.7)
Input: #516 / 576 ($3.00 per 1M tokens)
Output: #524 / 576 ($15.00 per 1M tokens)
Context: #198 / 576 (200,000 tokens)

Analysis Summary

Claude 3.7 Sonnet (thinking) is the extended-reasoning variant of Anthropic's Claude 3.7 Sonnet, with an Artificial Analysis Intelligence Index of 34.7 and stronger math (56.3) and GPQA Diamond (77.2%) scores than its standard sibling. It retains full vision, tool use, and function calling support alongside a 200K-token context window, making it one of the more capable mid-tier models for complex analytical and coding tasks.

For businesses, the thinking mode makes it particularly well-suited to multi-step reasoning tasks, technical problem-solving, and scenarios where answer quality is critical. Its long-context reasoning (LCR) score of 60.7% is strong, supporting detailed document analysis. Instruction-following is moderate (IFBench 48.3%), so structured output pipelines may need prompt engineering attention.

Pricing matches the standard Claude 3.7 Sonnet at $3.00 per million input tokens and $15.00 per million output tokens, which is a meaningful cost for high-volume use. It is best deployed on high-value tasks where the reasoning uplift justifies the spend, rather than routine content generation.

Assessed June 6, 2026

Editorial notes

Claude 3.7 Sonnet (thinking) from Anthropic adds extended reasoning to an already capable base, with strong math, coding, and long-context performance across a 200K window with vision and tool use.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.

Performance Profile

Intelligence 5.2/10
Technical 4.7/10
Content 6.3/10
Value 6/10

How Anthropic: Claude 3.7 Sonnet (thinking) compares

Anthropic: Claude 3.7 Sonnet (thinking) ranks #80 of 377 AI models we track for overall intelligence, #98 of 314 for coding, and #114 of 289 for agentic tasks. Its 200K-token context window is larger than 66% of the models we list. At $3.00 per million input tokens, it is cheaper than 10% of comparable models.
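
The ranks above are easier to compare once converted to a share of each field. The following is a minimal sketch of that arithmetic; the helper name `top_percent` is hypothetical and not part of any ranking methodology described on this page.

```python
def top_percent(rank: int, total: int) -> float:
    """Return the share of the field (in percent) at or above the given rank."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * rank / total


# Rank and field size for each category, copied from the comparison paragraph.
rankings = {
    "intelligence": (80, 377),
    "coding": (98, 314),
    "agentic": (114, 289),
}

for category, (rank, total) in rankings.items():
    print(f"{category}: #{rank} of {total} -> top {top_percent(rank, total):.1f}%")
```

Read this way, the model sits in about the top fifth of tracked models for intelligence and about the top third for coding.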

About Anthropic: Claude 3.7 Sonnet (thinking)

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and..

Capabilities

Tool Use, Function Calling, Vision

Performance Indices

Source: Artificial Analysis

34.7 Intelligence Index
27.6 Coding Index
37.9 Agentic Index
56.3 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 77.2% Graduate-level scientific reasoning
HLE 10.3% Humanity's Last Exam
MMLU Pro 83.7% Multi-task language understanding
MATH 500 94.7% Mathematical problem-solving
AIME 48.7% Competition mathematics
AIME 2025 56.3% Competition mathematics (2025)
SciCode 40.3% Scientific computing

Technical

LiveCodeBench 47.3% Live coding evaluation
TerminalBench Hard 21.2% Agentic terminal tasks
τ²-Bench 54.7% Conversational agent benchmark

Content

IFBench 48.3% Instruction following
LCR 60.7% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID: anthropic/claude-3.7-sonnet:thinking
Provider: anthropic
Model Family: Claude 3
Release Date: February 24, 2025
Context Length: 200,000 tokens
Max Completion: 64,000 tokens
Status: Active
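
The context length and max completion figures interact when sizing a request. The sketch below assumes the prompt and the completion share the same 200,000-token window, which this page does not state explicitly; the function name `max_output_budget` is hypothetical.

```python
# Limits copied from the Model Information listing above.
CONTEXT_LENGTH = 200_000  # tokens
MAX_COMPLETION = 64_000   # tokens


def max_output_budget(prompt_tokens: int) -> int:
    """Largest completion that fits, assuming prompt and completion share the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens
    # The completion is capped both by the model limit and by what is left in the window.
    return max(0, min(MAX_COMPLETION, remaining))


# Hypothetical prompt sizes, for illustration only.
for prompt_tokens in (50_000, 150_000, 199_000):
    print(prompt_tokens, "->", max_output_budget(prompt_tokens))
```

Under that assumption, prompts up to 136,000 tokens leave room for the full 64,000-token completion, and longer prompts shrink the available output.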

Pricing

Token Type | Cost per 1M tokens | Cost per 1K tokens
Input | $3.00 | $0.003000
Output | $15.00 | $0.015000
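
To turn these rates into a budget figure, a rough per-request estimate can be computed as below. This is a sketch under stated assumptions: the workload numbers are hypothetical, the function name `estimate_cost_usd` is illustrative, and reasoning tokens produced in thinking mode are assumed to count toward output tokens, which this page does not confirm.

```python
# Rates copied from the pricing table above (USD per 1M tokens).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 20,000 input tokens and 4,000 output tokens per request.
per_request = estimate_cost_usd(20_000, 4_000)
print(f"per request: ${per_request:.4f}")
print(f"per 1,000 requests: ${per_request * 1_000:.2f}")
```

Because output costs five times as much as input per token, long reasoning traces can dominate the bill even when prompts are large.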

Frequently asked questions about Anthropic: Claude 3.7 Sonnet (thinking)

How much does Anthropic: Claude 3.7 Sonnet (thinking) cost?

Anthropic: Claude 3.7 Sonnet (thinking) costs $3.00 per million input tokens and $15.00 per million output tokens.

What is the context window of Anthropic: Claude 3.7 Sonnet (thinking)?

Anthropic: Claude 3.7 Sonnet (thinking) has a context window of 200,000 tokens (200K).

Is Anthropic: Claude 3.7 Sonnet (thinking) good for coding?

On our coding benchmark index, Anthropic: Claude 3.7 Sonnet (thinking) ranks #98 of 314 models, placing it in roughly the top third of the field for code generation and debugging.

What can Anthropic: Claude 3.7 Sonnet (thinking) do?

Anthropic: Claude 3.7 Sonnet (thinking) supports image/vision input, tool use, and function calling.

Who created Anthropic: Claude 3.7 Sonnet (thinking)?

Anthropic: Claude 3.7 Sonnet (thinking) is developed by Anthropic and was released on February 24, 2025.