NVIDIA: Llama 3.1 Nemotron Ultra 253B v1

nvidia · Released Apr 8, 2025 · Efficient
Intelligence #182 / 525
42.3 Our Score
Speed #208 / 244
42.2 tokens / sec
Input #360 / 525
$0.600 per 1M tokens
Output #343 / 525
$1.80 per 1M tokens
Context #185 / 525
131,072 tokens

Analysis Summary

NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 sits in the Efficient tier on our leaderboard, ranked #182 of 525 published models on overall intelligence. At $0.600 input and $1.80 output per 1M tokens, it ranks #360 and #343 of 525 on input and output price respectively. It offers a 131,072-token context window (#185 of 525) and supports reasoning.

Editorial notes

NVIDIA's Llama 3.1 Nemotron Ultra 253B posts strong maths and MMLU Pro scores (MATH 500 95.2%, MMLU Pro 82.5%), but its composite Intelligence Index (15) and Agentic Index (6.9) are notably low, suggesting uneven capability across real-world tasks. At $0.60/$1.80 per million tokens, it offers limited value relative to more balanced alternatives.

Assessed April 23, 2026

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.

Performance Profile

Intelligence 3.6/10
Technical 2.1/10
Content 4.5/10
Value 7.3/10

Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-Instruct, it has been significantly customized using Neural..

253B Parameters

Performance Indices

Source: Artificial Analysis

15 Intelligence Index
13.1 Coding Index
6.9 Agentic Index
63.7 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 72.8% Graduate-level scientific reasoning
HLE 8.1% Humanity's Last Exam
MMLU Pro 82.5% Multi-task language understanding
MATH 500 95.2% Mathematical problem-solving
AIME 74.7% Competition mathematics
AIME 2025 63.7% Competition mathematics (2025)
SciCode 34.7% Scientific computing

Technical

LiveCodeBench 64.1% Live coding evaluation
TerminalBench Hard 2.3% Agentic terminal tasks
τ²-Bench 11.4% Conversational agent benchmark

Content

IFBench 38.2% Instruction following
LCR 7.3% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID nvidia/llama-3.1-nemotron-ultra-253b-v1
Provider nvidia
Release Date April 8, 2025
Context Length 131,072 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.60 $0.000600
Output $1.80 $0.001800
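
To make the per-token rates concrete, here is a minimal sketch of how a single request's cost could be estimated from the listed prices. The function name `estimate_cost_usd` and the token counts in the usage example are illustrative assumptions, not part of any provider API. Actual billing may differ (for example, if reasoning tokens are counted as output).

```python
from decimal import Decimal

# Listed prices from the pricing table above (USD per 1M tokens).
INPUT_USD_PER_M = Decimal("0.60")
OUTPUT_USD_PER_M = Decimal("1.80")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_M
        + Decimal(output_tokens) * OUTPUT_USD_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token reply.
    print(estimate_cost_usd(10_000, 2_000))
```

Decimal arithmetic is used here so that small per-request amounts do not pick up floating-point rounding noise when summed over many calls.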