NVIDIA: Llama 3.1 Nemotron 70B Instruct

NVIDIA: Llama 3.1 Nemotron 70B Instruct

nvidia · Released Oct 15, 2024
44
Our Score

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed for generating precise and useful responses. Leveraging Llama 3.1 70B architecture and Reinforcement Learning from Human Feedback (RLHF), it excels in automatic alignment benchmarks. This model is tailored for applications requiring high accuracy in helpfulness and response generation, suitable for diverse user queries across multiple domains. Usage of this model is subject to Meta's Acceptable Use Policy.

$1.20 / 1M Input Price
$1.20 / 1M Output Price
131,072 tokens Context Window
16,384 tokens Max Output
70B Parameters

Capabilities

Tool Use Function Calling

Architecture

ModalityText → Text
TokenizerLlama3
Instruct Typellama3
Parameters70B

Performance Indices

Source: Artificial Analysis

13.4 Intelligence Index
10.8 Coding Index
13.8 Agentic Index
11 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 46.5%
Graduate-level scientific reasoning
HLE 4.6%
Humanity's Last Exam
MMLU Pro 69%
Multi-task language understanding
LiveCodeBench 16.9%
Live coding evaluation
SciCode 23.3%
Scientific computing
MATH 500 73.3%
Mathematical problem-solving
AIME 24.7%
Competition mathematics
AIME 2025 11%
Competition mathematics (2025)
IFBench 30.7%
Instruction following
LCR 7%
Long-context reasoning
TerminalBench Hard 4.5%
Agentic terminal tasks
τ²-Bench 23.1%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID nvidia/llama-3.1-nemotron-70b-instruct
Providernvidia
Release Date October 15, 2024
Context Length131,072 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $1.20 $0.001200
Output $1.20 $0.001200

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

100%
Avg Uptime
160ms
Best Latency (TTFT)
15 tok/s
Best Throughput
1/1
Active Endpoints
Available via: DeepInfra

Leaderboard Categories