Nemotron 3 Nano
Nemotron 3 Nano is a 31.6B parameter hybrid large language model developed by NVIDIA, released on December 15, 2025. It employs a novel Hybrid Mamba-Transformer Mixture-of-Experts (MoE) architecture, activating only ~3.2B parameters per token for efficient inference. Optimized for agentic AI, reasoning, and coding, it supports a 1 million token context window and features a 'Reasoning ON/OFF' toggle with a configurable thinking budget. NVIDIA has released the model with open weights, training data, and training recipes under the NVIDIA Open Model License.
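A minimal sketch of loading the model with Hugging Face transformers and flipping the 'Reasoning ON/OFF' toggle; the repository name (taken from the Variants list below) and the system-prompt control strings are assumptions, not confirmed API details, so check the official model card for the exact convention.

```python
# Sketch: load Nemotron 3 Nano and toggle reasoning via the chat template.
# The model ID and the "/think" / "/no_think" switches are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16"  # assumed Hugging Face repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

def generate(prompt: str, reasoning: bool) -> str:
    # Assumed convention: a system-prompt switch drives the reasoning toggle.
    messages = [
        {"role": "system", "content": "/think" if reasoning else "/no_think"},
        {"role": "user", "content": prompt},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=1024)
    # Decode only the newly generated tokens.
    return tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(generate("Plan a three-step refactor for a flaky test suite.", reasoning=True))
```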
Specifications
- Parameters: 31.6B total, ~3.2B active per token
- Architecture: Hybrid Mamba-Transformer Mixture-of-Experts (MoE)
- License: NVIDIA Open Model License
- Context Window: 1,000,000 tokens (see the token-budget sketch below)
- Max Output: 128,000 tokens
- Training Data Cutoff: November 2025
- Type: text
- Modalities: text
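Because the 1,000,000-token window must hold both the prompt and up to 128,000 output tokens, it can help to check the budget before sending a very long input. A minimal sketch using the model's tokenizer; the repository name and input file are placeholders.

```python
# Sketch: verify a long prompt plus the requested completion fits the
# advertised context window. Limits come from the spec table above.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 1_000_000   # total window from the spec table above
MAX_OUTPUT = 128_000         # maximum output tokens from the spec table above

# Assumed Hugging Face repo name; see the Variants entry under Advanced Specifications.
tokenizer = AutoTokenizer.from_pretrained("nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16")

def fits_in_context(prompt: str, max_new_tokens: int) -> bool:
    # Count prompt tokens and leave room for the requested completion.
    prompt_tokens = len(tokenizer.encode(prompt))
    return prompt_tokens + max_new_tokens <= CONTEXT_WINDOW

document = open("repo_dump.txt", encoding="utf-8").read()  # placeholder input file
print(fits_in_context(document, max_new_tokens=4_096))
```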
Benchmarks
Reported evaluations cover the following benchmarks:
- MMLU: Massive Multitask Language Understanding tests knowledge across 57 subjects, including mathematics, history, computer science, and law.
- MMLU-Pro: An enhanced MMLU benchmark with over 12,000 challenging questions across 14 domains.
- GSM8K: Grade School Math 8K consists of 8.5K high-quality grade school math word problems.
- MATH: A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning.
- MATH-500: A sample of 500 diverse problems from the MATH benchmark, spanning topics like probability, algebra, and geometry.
- HumanEval: Evaluates code generation by asking models to complete Python functions from their docstrings.
- Global-MMLU-Lite: A balanced collection of culturally sensitive and culturally agnostic MMLU tasks designed for efficient multilingual evaluation.
- GPQA: Graduate-Level Google-Proof Q&A evaluates advanced reasoning on graduate-level science questions.
- A challenging benchmark of novel problems designed to test the limits of AI capabilities.
- SWE-bench: Software Engineering Benchmark evaluates models on real-world software engineering tasks drawn from GitHub issues.
- TAU-bench: Evaluates models on their ability to use tools in multi-turn interactions with users.
- Terminal-Bench: Evaluates models on their ability to use terminal commands to solve system administration tasks.
- BFCL: The Berkeley Function Calling Leaderboard, the first comprehensive evaluation of LLMs' function calling capabilities, covering forms such as parallel and multiple-function calls.
- A multi-domain challenge set created by Scale AI to test models across diverse tasks.
Advanced Specifications
- Model Family: Nemotron 3
- Finetuned From: NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16
- API Access: Available (see the example request below)
- Chat Interface: Available
- Multilingual Support: Yes
- Variants: NVIDIA-Nemotron-3-Nano-30B-A3B-BF16, NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16, NVIDIA-Nemotron-3-Nano-30B-A3B-FP8, GGUF (community), GPTQ (community)
- Hardware Support: CUDA, NVIDIA RTX, NVIDIA H100, NVIDIA B200
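A minimal sketch of calling the model through an OpenAI-compatible endpoint, as commonly exposed by hosted NVIDIA APIs or local servers such as vLLM; the base URL and model identifier are placeholders, not confirmed values.

```python
# Sketch: chat completion against an OpenAI-compatible endpoint serving
# Nemotron 3 Nano. Base URL, API key variable, and model name are assumed.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumption: an OpenAI-compatible endpoint
    api_key=os.environ["NVIDIA_API_KEY"],
)

response = client.chat.completions.create(
    model="nvidia/nvidia-nemotron-3-nano-30b-a3b",  # hypothetical model identifier
    messages=[
        {"role": "user", "content": "Summarize the trade-offs of a hybrid Mamba-Transformer MoE."},
    ],
    temperature=0.6,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```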
Capabilities & Limitations
- Capabilities: agentic AI, reasoning, code, math, function calling, tool use, RAG
- Known Limitations: Reduced accuracy on complex tasks if reasoning traces are disabled
- Notable Use Cases: multi-agent systems, coding assistant, document QA, long-context reasoning
- Function Calling Support: Yes (see the tool-calling sketch below)
- Tool Use Support: Yes
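Since function calling and tool use are supported, requests can describe tools with the OpenAI-style `tools` schema. A minimal sketch under the same endpoint assumptions as the example above; the get_weather tool and model identifier are illustrative.

```python
# Sketch: OpenAI-style tool calling with Nemotron 3 Nano. Endpoint, model
# name, and the get_weather tool are illustrative assumptions.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumption: an OpenAI-compatible endpoint
    api_key=os.environ["NVIDIA_API_KEY"],
)

# get_weather is a made-up example tool; only its JSON schema is sent to the model.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. 'Berlin'"},
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="nvidia/nvidia-nemotron-3-nano-30b-a3b",  # hypothetical model identifier
    messages=[{"role": "user", "content": "What's the weather in Berlin right now?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model elects to call a tool, its arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```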