NVIDIA: Llama 3.1 Nemotron Ultra 253B v1
Analysis Summary
NVIDIA Llama 3.1 Nemotron Ultra 253B v1 is a large open-weight model fine-tuned by NVIDIA for reasoning and math tasks. Its math index of 63.7 and LiveCodeBench score of 0.641 are competitive, and its MMLU-Pro score of 0.825 reflects broad knowledge coverage. However, it ranks low for overall intelligence (#238 of 382 tracked models) and near the bottom for agentic tasks (#281 of 293), suggesting the model is not well suited to multi-step tool use or autonomous workflows.
For businesses, it may serve niche use cases in mathematical reasoning or structured knowledge retrieval, but the very low agentic capability and poor terminal benchmark scores make it unsuitable for most modern agentic or coding pipelines. The 131K context window is adequate but not exceptional. No tool use or function calling is listed, which further limits its versatility.
At $0.60 input and $1.80 output per million tokens, pricing is moderate. Teams with specific math-heavy or knowledge-retrieval needs may find value here, but general business workflows will be better served by models with stronger agentic and reasoning profiles.
Assessed June 17, 2026
Editorial notes
NVIDIA Llama 3.1 Nemotron Ultra 253B v1 has a strong math index and a competitive LiveCodeBench score, but its intelligence and agentic indices are very low, which limits its general business utility.
Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch.
Performance Profile
How NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 compares
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 ranks #238 of the 382 AI models we track for overall intelligence and #281 of 293 for agentic tasks. Its 131K-token context window is larger than that of 59% of the models we list. At $0.60 per million input tokens, it is cheaper than 33% of comparable models.
About NVIDIA: Llama 3.1 Nemotron Ultra 253B v1
Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta's Llama-3.1-405B-Instruct, it has been significantly customized using Neural..
Performance Indices (chart; source: Artificial Analysis)
Benchmark Scores (chart; tabs: Intelligence, Technical, Content)
Benchmark data from Artificial Analysis and Hugging Face
Model Information
| OpenRouter ID | nvidia/llama-3.1-nemotron-ultra-253b-v1 |
| Provider | nvidia |
| Release Date | April 8, 2025 |
| Context Length | 131,072 tokens |
| Status | Active |
Pricing
| Token Type | Cost per 1M tokens | Cost per 1K tokens |
|---|---|---|
| Input | $0.60 | $0.000600 |
| Output | $1.80 | $0.001800 |
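
To make the per-token rates concrete, the sketch below estimates the cost of a single request from the listed prices. The function name `estimate_cost_usd` and the example token counts are illustrative assumptions, not part of any provider API; actual billing may differ (for example, through provider-specific rounding or fees).

```python
# Listed prices for this model, in USD per one million tokens.
INPUT_USD_PER_MILLION = 0.60
OUTPUT_USD_PER_MILLION = 1.80


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed example workload: a 20,000-token prompt and a 2,000-token reply.
    cost = estimate_cost_usd(20_000, 2_000)
    print(f"Estimated cost: ${cost:.4f}")
```

For the assumed workload above, the input side contributes $0.0120 and the output side $0.0036, so output tokens cost three times as much per token as input tokens.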
Frequently asked questions about NVIDIA: Llama 3.1 Nemotron Ultra 253B v1
How much does NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 cost?
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 costs $0.60 per million input tokens and $1.80 per million output tokens.
What is the context window of NVIDIA: Llama 3.1 Nemotron Ultra 253B v1?
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 has a context window of 131,072 tokens (131K).
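
As a rough illustration of what the 131,072-token limit means in practice, the sketch below computes how many tokens remain for the model's reply once the prompt is accounted for. It assumes that prompt and completion share a single context window, which is common but not confirmed by this listing, and it takes the prompt's token count as given; counting tokens requires the model's tokenizer, which is not shown here. The helper name `remaining_output_tokens` is hypothetical.

```python
# Context window listed for this model, in tokens.
CONTEXT_WINDOW_TOKENS = 131_072


def remaining_output_tokens(
    prompt_tokens: int, context_window: int = CONTEXT_WINDOW_TOKENS
) -> int:
    """Return the tokens left for the reply, or 0 if the prompt fills the window.

    Assumes prompt and reply share one window (an assumption, see lead-in).
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


if __name__ == "__main__":
    # Assumed example: a 100,000-token prompt.
    print(remaining_output_tokens(100_000))
```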
Who created NVIDIA: Llama 3.1 Nemotron Ultra 253B v1?
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 is developed by NVIDIA and was released on April 8, 2025.
Data sourced from OpenRouter API, Artificial Analysis and Hugging Face Open LLM Leaderboard. Scores are editorially curated by our team.
Last updated: June 27, 2026 9:41 am