Home > AI Models > NVIDIA: Nemotron 3 Ultra

NVIDIA: Nemotron 3 Ultra

Name: NVIDIA: Nemotron 3 Ultra Review
Item: NVIDIA: Nemotron 3 Ultra
Author: Design for Online Editorial

NEWKimi K3in at #9 NEWKAT-Coder-Air V2.5in at #560 NEWKAT-Coder-Pro V2.5in at #568 NEWMuse Spark 1.1in at #392 NEWUncensoredin at #487 NEWGPT-5.6 Terrain at #11 NEWGPT-5.6 Sol Proin at #416 NEWGPT-5.6 Solin at #2

NVIDIA: Nemotron 3 Ultra

nvidia · Released Jun 4, 2026

Intelligence #9 / 612

82.0 our score

Speed #35 / 287

195.3 tok/s

Input Price #406 / 612

$0.600 per 1M tokens

Output Price #462 / 612

$3.60 per 1M tokens

Context #63 / 612

1M tokens

NVIDIA Nemotron 3 Ultra is a strong mid-tier model with a 37.8 intelligence index, a 49.3 coding index approaching world-class territory, and a 59.8 agentic index that places it comfortably in the upper tier for autonomous task execution. It supports tool use and function calling across a 1M token context window, making it well suited to long-document analysis and multi-step agentic pipelines.

For businesses, the combination of near-excellent coding capability, strong agentic reliability, and a 1M context window makes it a practical choice for software engineering workflows, code review automation, and document-heavy analysis tasks. The absence of vision support is a limitation for teams needing multimodal processing.

At $0.50 input and $2.20 output per million tokens, it offers strong price-performance for technical workloads. Teams running high-volume coding or agentic tasks will find it a cost-effective alternative to premium flagship models, particularly where vision is not required.

Assessed July 10, 2026

Editorial notes

NVIDIA Nemotron 3 Ultra delivers strong coding and agentic performance with a 1M token context, tool use, and competitive pricing at $0.50 input, making it a high-value option for technical business workflows.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

DFO Verdict

#9 of 612 overall

Benchmark scores

GPQA Diamond 86.7%

HLE 26.6%

SciCode 39.9%

TerminalBench Hard 36.4%

τ²-Bench 83.3%

IFBench 81.4%

LCR 67%

Magenta = intelligence · Ink = technical/agentic · Cyan = content & long-context · Grey = community benchmarks. Data: Artificial Analysis, Hugging Face.

37.8 Intelligence Index·49.3 Coding Index·59.8 Agentic Index

How NVIDIA: Nemotron 3 Ultra compares

NVIDIA: Nemotron 3 Ultra ranks #46 of 393 AI models we track for overall intelligence, #39 of 157 for coding, #53 of 300 for agentic tasks. Its 1M-token context window is larger than 90% of the models we list. At $0.60 per million input tokens it is cheaper than 34% of comparable models.

Position in the field

Intelligence: smarter than 99% of models #9

Speed: faster than 88% of models #35

Price: cheaper than 34% of models #406

Context: larger than 90% of models #63

worst in fieldmedianbest in field

Price vs frontier peers · $ per 1M tokens

NVIDIA: Nemotron 3 Ultra $0.60 in $3.60 out

Anthropic: Claude Fable 5 $10.00 in $50.00 out

Anthropic: Claude Opus 4.8 $5.00 in $25.00 out

Google: Gemini 3.1 Pro Preview $2.00 in $12.00 out

Dark bar = input · light bar = output, scaled to the priciest peer.

Context window vs peers · tokens

Google: Gemini 3.1 Pro Preview 1M

NVIDIA: Nemotron 3 Ultra 1M

Anthropic: Claude Fable 5 1M

Anthropic: Claude Opus 4.8 1M

1M tokens ≈ 8 full-length novels or ~2,500 pages of business documents in a single request.

Performance profile

Strongest on content. The pulled-in intelligence corner is the trade-off, and if the shape matters more than the price, this is your model.

Compare shapes side-by-side →

Pricing

Token Type	Cost per 1M tokens	Cost per 1K tokens
Input	$0.60	$0.000600
Output	$3.60	$0.003600

What would NVIDIA: Nemotron 3 Ultra cost your business?

Pick the job that looks most like yours, then fine-tune with the sliders. Estimates update live.

A website chatbot handling around 100 customer conversations a day, a few short messages each.

Requests per month 3,000

One request is one message, email, draft or automation call.

Size of each request 1,200 tokens

$0/mo NVIDIA: Nemotron 3 Ultra

$0/mo Anthropic: Claude Fable 5

$0/mo Z.ai: GLM 5.2 · best value

Full calculator with 612 models → Price Calculator

DFO AI AUTOMATION

These numbers get smaller with the right architecture.

We route routine calls to cheap models and save NVIDIA: Nemotron 3 Ultra for the hard ones. Most clients cut their estimate by 60-80%.

Talk to our team

About NVIDIA: Nemotron 3 Ultra

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it..

Embed this ranking

Writing about this model? Add the badge to your site. It always shows the current rank and score, and links back to this page.

NVIDIA: Nemotron 3 Ultra rank badge, Dark

<a href="https://designforonline.com/ai-models/nvidia-nemotron-3-ultra/"><img src="https://designforonline.com/?aiml_badge=nvidia-nemotron-3-ultra&theme=dark" alt="NVIDIA: Nemotron 3 Ultra, ranked #9 on the Design for Online AI Leaderboard" width="400" height="76"></a>

NVIDIA: Nemotron 3 Ultra rank badge, Light

<a href="https://designforonline.com/ai-models/nvidia-nemotron-3-ultra/"><img src="https://designforonline.com/?aiml_badge=nvidia-nemotron-3-ultra&theme=light" alt="NVIDIA: Nemotron 3 Ultra, ranked #9 on the Design for Online AI Leaderboard" width="400" height="76"></a>

Frequently asked questions about NVIDIA: Nemotron 3 Ultra

How much does NVIDIA: Nemotron 3 Ultra cost?

NVIDIA: Nemotron 3 Ultra costs $0.60 per million input tokens and $3.60 per million output tokens.

What is the context window of NVIDIA: Nemotron 3 Ultra?

NVIDIA: Nemotron 3 Ultra has a context window of 1,000,000 tokens (1M).

Is NVIDIA: Nemotron 3 Ultra good for coding?

On our coding benchmark index, NVIDIA: Nemotron 3 Ultra ranks #39 of 157 models, placing it in the top quartile of the field for code generation and debugging.

What can NVIDIA: Nemotron 3 Ultra do?

NVIDIA: Nemotron 3 Ultra supports tool use and function calling.

Who created NVIDIA: Nemotron 3 Ultra?

NVIDIA: Nemotron 3 Ultra is developed by NVIDIA and was released on June 4, 2026.

Performance profile

Intelligence 5.8

Technical 6.3

Content 8.5

Value 7.3

Reasoning: Yes
Input
Output
Context: 1M tokens
Tokenizer: Other
Released: Jun 4, 2026

Modality data from OpenRouter; may understate provider-native audio/video/image output.

Model information

Provider nvidia

OpenRouter ID nvidia/nemotron-3-ultra-550b-a55b

Status Active

Capabilities

Tool Use Function Calling

Ranked in

AI Agents Coding General Tool Use

Live performance · 30 min refresh

99.5% Avg uptime

455ms Best latency

129 tok/s Best throughput

3/4 Active endpoints

External resources View on OpenRouter API access, playground & provider details API Quickstart Sample code and integration guide

Data sourced from the OpenRouter API, Artificial Analysis, the Hugging Face Open LLM Leaderboard and our own internal testing. Scores are editorially curated by our team.

Last updated: July 19, 2026 8:38 pm

Issues with our rankings? Contact us

NVIDIA: Nemotron 3 Ultra

DFO Verdict

Benchmark scores

How NVIDIA: Nemotron 3 Ultra compares

Pricing

What would NVIDIA: Nemotron 3 Ultra cost your business?

About NVIDIA: Nemotron 3 Ultra

Explore Related Models

Embed this ranking

Frequently asked questions about NVIDIA: Nemotron 3 Ultra

How much does NVIDIA: Nemotron 3 Ultra cost?

What is the context window of NVIDIA: Nemotron 3 Ultra?

Is NVIDIA: Nemotron 3 Ultra good for coding?

What can NVIDIA: Nemotron 3 Ultra do?

Who created NVIDIA: Nemotron 3 Ultra?