NVIDIA: Nemotron 3 Ultra
Analysis Summary
NVIDIA Nemotron 3 Ultra is a strong mid-tier model with a coding index approaching the top tier and an agentic index that competes with models well above its price point. It supports tool use, function calling, and a 1M token context window, making it a capable choice for agentic pipelines and long-context workloads. Instruction following is a particular strength, with one of the higher ifbench scores in this batch.
For businesses, it is well-suited to software engineering assistance, multi-step tool-use workflows, and long-document processing. The lack of vision support is a limitation for teams needing multimodal capability, but for text-only agentic and coding tasks it punches above its price.
At $0.50 input and $2.20 output per million tokens, it offers strong price-performance for production use. Teams running high-volume agentic or coding workloads who do not need vision should consider it as a cost-efficient alternative to pricier flagship models.
Assessed June 30, 2026
Editorial notes
NVIDIA Nemotron 3 Ultra delivers strong coding and agentic performance with a 1M context window, tool use, and competitive pricing at $0.50 input per million tokens.
Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?
Performance Profile
How NVIDIA: Nemotron 3 Ultra compares
NVIDIA: Nemotron 3 Ultra ranks #42 of 385 AI models we track for overall intelligence, #34 of 129 for coding, #63 of 293 for agentic tasks. Its 1M-token context window is larger than 91% of the models we list. At $0.50 per million input tokens it is cheaper than 35% of comparable models.
About NVIDIA: Nemotron 3 Ultra
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it..
Capabilities
Performance Indices
Source: Artificial Analysis
Benchmark Scores
Intelligence
Technical
Content
Benchmark data from Artificial Analysis and Hugging Face
How does NVIDIA: Nemotron 3 Ultra stack up?
Compare side-by-side with other professional models.
Model Information
| OpenRouter ID |
nvidia/nemotron-3-ultra-550b-a55b
|
| Provider | nvidia |
| Release Date | June 4, 2026 |
| Context Length | 1,000,000 tokens |
| Max Completion | 16,384 tokens |
| Status | Active |
Pricing
| Token Type | Cost per 1M tokens | Cost per 1K tokens |
|---|---|---|
| Input | $0.50 | $0.000500 |
| Output | $2.20 | $0.002200 |
Live Performance
Live endpoint metrics, refreshed every 30 minutes.
External Resources
Explore Related Models
Frequently asked questions about NVIDIA: Nemotron 3 Ultra
How much does NVIDIA: Nemotron 3 Ultra cost?
NVIDIA: Nemotron 3 Ultra costs $0.50 per million input tokens and $2.20 per million output tokens.
What is the context window of NVIDIA: Nemotron 3 Ultra?
NVIDIA: Nemotron 3 Ultra has a context window of 1,000,000 tokens (1M).
Is NVIDIA: Nemotron 3 Ultra good for coding?
On our coding benchmark index, NVIDIA: Nemotron 3 Ultra ranks #34 of 129 models, placing it in the broader range of the field for code generation and debugging.
What can NVIDIA: Nemotron 3 Ultra do?
NVIDIA: Nemotron 3 Ultra supports tool use and function calling.
Who created NVIDIA: Nemotron 3 Ultra?
NVIDIA: Nemotron 3 Ultra is developed by NVIDIA and was released on June 4, 2026.
Data sourced from OpenRouter API, Artificial Analysis and Hugging Face Open LLM Leaderboard. Scores are editorially curated by our team.
Last updated: June 30, 2026 9:37 pm