NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

nvidia · Released Oct 10, 2025 Professional
Intelligence #10 / 576
82.0 Our Score
Speed #174 / 271
67.6 tokens / sec
Input #356 / 577
$0.400 per 1M tokens
Output #226 / 577
$0.400 per 1M tokens
Context #233 / 577
131,072 tokens

Analysis Summary

NVIDIA's Llama 3.3 Nemotron Super 49B V1.5 is a mid-tier open-ecosystem model with an intelligence index of 18.5 and a coding index of just 9.4. Its agentic index of 26.9 is more competitive, and MMLU Pro at 0.785 shows reasonable general knowledge. Tool use and function calling are supported.

For businesses, the weak coding score limits its use in software engineering or technical automation. Long-context reliability is very low at 0.17, which restricts its usefulness for document-heavy tasks. It is better suited to general Q&A, moderate reasoning tasks, and lightweight agentic workflows where coding is not required.

Flat pricing at $0.40 per million tokens for both input and output is straightforward and cost-effective for the capability level. Teams looking for a budget-friendly general assistant with basic tool use may find it adequate, but most business workloads will be better served by models with stronger coding and context handling.

Assessed June 9, 2026

Editorial notes

NVIDIA Llama 3.3 Nemotron Super 49B V1.5 offers moderate reasoning and agentic capability at a flat $0.40/M rate, but coding performance is weak and long-context reliability is low.

Rankings consider pricing, capabilities, benchmarks, and real-world applicability and are refreshed as new models launch. Feedback?

Performance Profile

Intelligence3.4Technical2.6Value7.8Content3.1
Intelligence 3.4/10
Technical 2.6/10
Content 3.1/10
Value 7.8/10

How NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 compares

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 ranks #191 of 378 AI models we track for overall intelligence, #248 of 315 for coding, #150 of 289 for agentic tasks. Its 131K-token context window is larger than 60% of the models we list. At $0.40 per million input tokens it is cheaper than 38% of comparable models.

About NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and..

49B Parameters

Capabilities

Tool Use Function Calling

Performance Indices

Source: Artificial Analysis

18.5 Intelligence Index
9.4 Coding Index
26.9 Agentic Index
54.7 Math Index

Benchmark Scores

Intelligence

GPQA Diamond 64.3% Graduate-level scientific reasoning
HLE 6.5% Humanity's Last Exam
MMLU Pro 78.5% Multi-task language understanding
MATH 500 95.9% Mathematical problem-solving
AIME 58.3% Competition mathematics
AIME 2025 54.7% Competition mathematics (2025)
SciCode 28.2% Scientific computing

Technical

LiveCodeBench 27.7% Live coding evaluation
τ²-Bench 26.9% Conversational agent benchmark

Content

IFBench 38.1% Instruction following
LCR 17% Long-context reasoning

Benchmark data from Artificial Analysis and Hugging Face

How does NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 stack up?

Compare side-by-side with other professional models.

Compare Models

Model Information

OpenRouter ID nvidia/llama-3.3-nemotron-super-49b-v1.5
Providernvidia
Release Date October 10, 2025
Context Length131,072 tokens
Max Completion16,384 tokens
Status Active

Pricing

Token Type Cost per 1M tokens Cost per 1K tokens
Input $0.40 $0.000400
Output $0.40 $0.000400

Live Performance

Live endpoint metrics, refreshed every 30 minutes.

264ms
Best Latency (TTFT)
42.5 tok/s
Best Throughput
0/1
Active Endpoints
Available via: DeepInfra

Leaderboard Categories

Frequently asked questions about NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

How much does NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 cost?

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 costs $0.40 per million input tokens and $0.40 per million output tokens.

What is the context window of NVIDIA: Llama 3.3 Nemotron Super 49B V1.5?

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 has a context window of 131,072 tokens (131K).

Is NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 good for coding?

On our coding benchmark index, NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 ranks #248 of 315 models, placing it in the broader range of the field for code generation and debugging.

What can NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 do?

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 supports tool use and function calling.

Who created NVIDIA: Llama 3.3 Nemotron Super 49B V1.5?

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 is developed by NVIDIA and was released on October 10, 2025.