DeepSeek: R1 Distill Llama 70B

DeepSeek: R1 Distill Llama 70B

deepseek · Released Jan 23, 2025 Professional
Intelligence #10 / 576
82.0 Our Score
Speed #245 / 271
39.2 tokens / sec
Input #419 / 577
$0.800 per 1M tokens
Output #302 / 577
$0.800 per 1M tokens
Context #328 / 577
128,000 tokens

Analysis Summary

DeepSeek R1 Distill Llama 70B is a distilled reasoning model built on a Llama 70B backbone, released by DeepSeek in January 2025. It shows strong mathematical reasoning relative to its price point, with an AIME score above 0.53, but its coding and agentic indices are low, and instruction-following metrics are weak.

For businesses, this model suits math-heavy or analytical tasks where cost is a priority and agentic reliability is not required. It is not well-suited to autonomous workflows, structured content generation, or tool-use pipelines, where its low agentic and ifbench scores would create friction.

At $0.70 input and $0.80 output per million tokens, it is among the more affordable options in this tier. Teams with specific quantitative or analytical workloads on a tight budget may find value here, but most business users will benefit from a more capable general-purpose model.

Assessed June 6, 2026

Editorial notes

DeepSeek R1 Distill Llama 70B is a cost-effective reasoning distillation with strong math performance, priced under $1/1M tokens, but limited agentic and instruction-following capability constrains its business utility.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence2.9Technical1.6Value7.3Content2.2
Intelligence 2.9/10
Technical 1.6/10
Content 2.2/10
Value 7.3/10

How DeepSeek: R1 Distill Llama 70B compares

DeepSeek: R1 Distill Llama 70B ranks #215 of 378 AI models we track for overall intelligence, #219 of 315 for coding, #246 of 289 for agentic tasks. Its 128K-token context window is larger than 43% of the models we list. At $0.80 per million input tokens it is cheaper than 27% of comparable models.

About DeepSeek: R1 Distill Llama 70B

DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The model combines advanced distillation techniques to achieve high performance across..

70B Parameters

Architecture Detail

Instruct Typedeepseek-r1

Performance Indices

Source: Artificial Analysis

16 Intelligence Index
11.4 Coding Index
11.7 Agentic Index
53.7 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 40.2% Graduate-level scientific reasoning
HLE 6.1% Humanity's Last Exam
MMLU Pro 79.5% Multi-task language understanding
MATH 500 93.5% Mathematical problem-solving
AIME 67% Competition mathematics
AIME 2025 53.7% Competition mathematics (2025)
SciCode 31.3% Scientific computing

Technical

LiveCodeBench 26.6% Live coding evaluation
TerminalBench Hard 1.5% Agentic terminal tasks
τ²-Bench 21.9% Conversational agent benchmark

Content

IFBench 27.6% Instruction following
LCR 11% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does DeepSeek: R1 Distill Llama 70B stack up?

Compare side-by-side with other professional models.

Compare Models

Model Information

OpenRouter ID deepseek/deepseek-r1-distill-llama-70b
Providerdeepseek
Model FamilyDeepSeek
Release Date January 23, 2025
Context Length128,000 tokens
Max Completion8,192 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.80 $0.000800
Output $0.80 $0.000800

Live Performance

Live endpoint metrics, refreshed every 30 minutes.

690ms
Best Latency (TTFT)
60 tok/s
Best Throughput
0/1
Active Endpoints
Available via: Novita

Leaderboard Categories

Frequently asked questions about DeepSeek: R1 Distill Llama 70B

How much does DeepSeek: R1 Distill Llama 70B cost?

DeepSeek: R1 Distill Llama 70B costs $0.80 per million input tokens and $0.80 per million output tokens.

What is the context window of DeepSeek: R1 Distill Llama 70B?

DeepSeek: R1 Distill Llama 70B has a context window of 128,000 tokens (128K).

Is DeepSeek: R1 Distill Llama 70B good for coding?

On our coding benchmark index, DeepSeek: R1 Distill Llama 70B ranks #219 of 315 models, placing it in the broader range of the field for code generation and debugging.

Who created DeepSeek: R1 Distill Llama 70B?

DeepSeek: R1 Distill Llama 70B is developed by DeepSeek and was released on January 23, 2025.