NVIDIA: Nemotron Nano 9B V2

nvidia · Released Sep 5, 2025
40 Our Score

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA and designed as a unified model for both reasoning and non-reasoning tasks. It responds to queries by first generating a reasoning trace and then concluding with a final response. Reasoning can be toggled via the system prompt: when the user prefers the final answer alone, the model can be configured to skip the intermediate reasoning trace.
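The reasoning toggle described above can be sketched as a request builder. This assumes an OpenAI-compatible chat endpoint (e.g. via OpenRouter) and uses the `/think` / `/no_think` system-prompt controls documented for the Nemotron family; treat the exact control strings as an assumption to verify against NVIDIA's model card.

```python
# Sketch: toggling Nemotron's reasoning trace via the system prompt.
# Assumes an OpenAI-compatible chat API; the "/think" / "/no_think"
# control strings are taken from NVIDIA's Nemotron documentation and
# should be verified for this specific checkpoint.

def build_request(question: str, reasoning: bool) -> dict:
    """Build a chat-completion payload for nvidia/nemotron-nano-9b-v2."""
    system = "/think" if reasoning else "/no_think"
    return {
        "model": "nvidia/nemotron-nano-9b-v2",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
        ],
    }

# With reasoning enabled, the model emits a trace before its final answer;
# with it disabled, it answers directly.
with_trace = build_request("What is 17 * 24?", reasoning=True)
direct = build_request("What is 17 * 24?", reasoning=False)

print(with_trace["messages"][0]["content"])  # /think
print(direct["messages"][0]["content"])      # /no_think
```

The payload is plain JSON-compatible data, so the same builder works with any HTTP client or the `openai` SDK pointed at a compatible base URL.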

$0.04 / 1M Input Price
$0.16 / 1M Output Price
131,072 tokens Context Window
9B Parameters

Capabilities

Tool Use Function Calling
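Since the model supports function calling, a request can declare tools using the OpenAI-style `tools` schema that OpenRouter accepts. The weather tool below (name, description, parameters) is a hypothetical placeholder, not part of the model's API.

```python
# Sketch: declaring a tool for function calling, assuming the
# OpenAI-style "tools" schema. The get_weather tool is hypothetical.

def make_tool_payload(prompt: str) -> dict:
    """Build a chat payload that exposes one callable tool to the model."""
    return {
        "model": "nvidia/nemotron-nano-9b-v2",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical example tool
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = make_tool_payload("What's the weather in Oslo?")
print(payload["tools"][0]["function"]["name"])  # get_weather
```

If the model decides to call the tool, the response carries a `tool_calls` entry whose arguments the caller executes and feeds back as a `tool` role message.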

Architecture

Modality Text → Text
Tokenizer Other
Parameters 9B

Performance Indices

Source: Artificial Analysis

13.2 Intelligence Index
7.5 Coding Index
12.1 Agentic Index
62.3 Math Index

Benchmark Scores

Evaluations

GPQA Diamond 55.7%
Graduate-level scientific reasoning
HLE 4%
Humanity's Last Exam
MMLU Pro 73.9%
Multi-task language understanding
LiveCodeBench 70.1%
Live coding evaluation
SciCode 20.9%
Scientific computing
AIME 2025 62.3%
Competition mathematics (2025)
IFBench 27.1%
Instruction following
LCR 22.7%
Long-context reasoning
TerminalBench Hard 0.8%
Agentic terminal tasks
τ²-Bench 23.4%
Conversational agent benchmark

Benchmark data from Artificial Analysis and Hugging Face

Model Information

OpenRouter ID nvidia/nemotron-nano-9b-v2
Provider nvidia
Release Date September 5, 2025
Context Length 131,072 tokens
Status Active

Pricing

Token Type | Cost per 1M tokens | Cost per 1K tokens
Input      | $0.04              | $0.000040
Output     | $0.16              | $0.000160
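The listed prices make per-request cost a simple calculation; a minimal sketch using the $0.04/1M input and $0.16/1M output rates from the table:

```python
# Sketch: estimating request cost from the listed per-1M-token prices.
INPUT_PER_M = 0.04   # USD per 1M input tokens
OUTPUT_PER_M = 0.16  # USD per 1M output tokens

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for one request at the listed rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1e6

# Example: a request with 10,000 input tokens and 2,000 output tokens.
print(round(cost_usd(10_000, 2_000), 6))  # 0.00072
```

At these rates, even a full 131,072-token context costs well under a cent of input.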

Live Performance

Live endpoint metrics — refreshed every 30 minutes.

219ms
Best Latency (TTFT)
131 tok/s
Best Throughput
0/1
Active Endpoints
Available via: DeepInfra