Nemotron 3 Nano
Nemotron 3 Nano is a 31.6B parameter hybrid large language model developed by NVIDIA, released on December 15, 2025. It employs a novel Hybrid Mamba-Transformer Mixture-of-Experts (MoE) architecture, activating only ~3.2B parameters per token for efficient inference. Optimized for agentic AI, reasoning, and coding, it supports a 1 million token context window and features a 'Reasoning ON/OFF' toggle with a configurable thinking budget. NVIDIA has released the model with open weights, training data, and training recipes under the NVIDIA Open Model License.
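A minimal sketch of loading the model with Hugging Face transformers and flipping the 'Reasoning ON/OFF' toggle; the repository name (taken from the Variants list below) and the system-prompt control strings are assumptions, not confirmed API details, so check the official model card for the exact convention.

```python
# Sketch: load Nemotron 3 Nano and toggle reasoning via the chat template.
# The model ID and the "/think" / "/no_think" switches are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16"  # assumed Hugging Face repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

def generate(prompt: str, reasoning: bool) -> str:
    # Assumed convention: a system-prompt switch drives the reasoning toggle.
    messages = [
        {"role": "system", "content": "/think" if reasoning else "/no_think"},
        {"role": "user", "content": prompt},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=1024)
    # Decode only the newly generated tokens.
    return tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(generate("Plan a three-step refactor for a flaky test suite.", reasoning=True))
```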
Specifications
- Parameters: 31.6B total, ~3.2B active per token
- Architecture: Hybrid Mamba-Transformer Mixture-of-Experts (MoE)
- License: NVIDIA Open Model License
- Context Window: 1,000,000 tokens (see the token-budget sketch below)
- Max Output: 128,000 tokens
- Training Data Cutoff: November 2025
- Type: text
- Modalities: text
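Because the 1,000,000-token window must hold both the prompt and up to 128,000 output tokens, it can help to check the budget before sending a very long input. A minimal sketch using the model's tokenizer; the repository name and input file are placeholders.

```python
# Sketch: verify a long prompt plus the requested completion fits the
# advertised context window. Limits come from the spec table above.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 1_000_000   # total window from the spec table above
MAX_OUTPUT = 128_000         # maximum output tokens from the spec table above

# Assumed Hugging Face repo name; see the Variants entry under Advanced Specifications.
tokenizer = AutoTokenizer.from_pretrained("nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16")

def fits_in_context(prompt: str, max_new_tokens: int) -> bool:
    # Count prompt tokens and leave room for the requested completion.
    prompt_tokens = len(tokenizer.encode(prompt))
    return prompt_tokens + max_new_tokens <= CONTEXT_WINDOW

document = open("repo_dump.txt", encoding="utf-8").read()  # placeholder input file
print(fits_in_context(document, max_new_tokens=4_096))
```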
Benchmarks
Reported evaluations cover the following benchmarks:
- MMLU: Massive Multitask Language Understanding tests knowledge across 57 subjects, including mathematics, history, computer science, and law.
- MMLU-Pro: An enhanced MMLU benchmark with over 12,000 challenging questions across 14 domains.
- GSM8K: Grade School Math 8K consists of 8.5K high-quality grade school math word problems.
- MATH: A dataset of 12,500 challenging competition mathematics problems requiring multi-step reasoning.
- MATH-500: A sample of 500 diverse problems from the MATH benchmark, spanning topics like probability, algebra, and geometry.
- HumanEval: Evaluates code generation by asking models to complete Python functions from their docstrings.
- Global-MMLU-Lite: A balanced collection of culturally sensitive and culturally agnostic MMLU tasks designed for efficient multilingual evaluation.
- GPQA: Graduate-Level Google-Proof Q&A evaluates advanced reasoning on graduate-level science questions.
- A challenging benchmark of novel problems designed to test the limits of AI capabilities.
- SWE-bench: Software Engineering Benchmark evaluates models on real-world software engineering tasks drawn from GitHub issues.
- TAU-bench: Evaluates models on their ability to use tools in multi-turn interactions with users.
- Terminal-Bench: Evaluates models on their ability to use terminal commands to solve system administration tasks.
- BFCL: The Berkeley Function Calling Leaderboard, the first comprehensive evaluation of LLMs' function calling capabilities, covering forms such as parallel and multiple-function calls.
- A multi-domain challenge set created by Scale AI to test models across diverse tasks.
Advanced Specifications
- Model Family: Nemotron 3
- Finetuned From: NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16
- API Access: Available (see the example request below)
- Chat Interface: Available
- Multilingual Support: Yes
- Variants: NVIDIA-Nemotron-3-Nano-30B-A3B-BF16, NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16, NVIDIA-Nemotron-3-Nano-30B-A3B-FP8, GGUF (community), GPTQ (community)
- Hardware Support: CUDA, NVIDIA RTX, NVIDIA H100, NVIDIA B200
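A minimal sketch of calling the model through an OpenAI-compatible endpoint, as commonly exposed by hosted NVIDIA APIs or local servers such as vLLM; the base URL and model identifier are placeholders, not confirmed values.

```python
# Sketch: chat completion against an OpenAI-compatible endpoint serving
# Nemotron 3 Nano. Base URL, API key variable, and model name are assumed.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumption: an OpenAI-compatible endpoint
    api_key=os.environ["NVIDIA_API_KEY"],
)

response = client.chat.completions.create(
    model="nvidia/nvidia-nemotron-3-nano-30b-a3b",  # hypothetical model identifier
    messages=[
        {"role": "user", "content": "Summarize the trade-offs of a hybrid Mamba-Transformer MoE."},
    ],
    temperature=0.6,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```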
Capabilities & Limitations
- Capabilities: agentic AI, reasoning, code, math, function calling, tool use, RAG
- Known Limitations: Reduced accuracy on complex tasks if reasoning traces are disabled
- Notable Use Cases: multi-agent systems, coding assistant, document QA, long-context reasoning
- Function Calling Support: Yes (see the tool-calling sketch below)
- Tool Use Support: Yes
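Since function calling and tool use are supported, requests can describe tools with the OpenAI-style `tools` schema. A minimal sketch under the same endpoint assumptions as the example above; the get_weather tool and model identifier are illustrative.

```python
# Sketch: OpenAI-style tool calling with Nemotron 3 Nano. Endpoint, model
# name, and the get_weather tool are illustrative assumptions.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumption: an OpenAI-compatible endpoint
    api_key=os.environ["NVIDIA_API_KEY"],
)

# get_weather is a made-up example tool; only its JSON schema is sent to the model.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. 'Berlin'"},
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="nvidia/nvidia-nemotron-3-nano-30b-a3b",  # hypothetical model identifier
    messages=[{"role": "user", "content": "What's the weather in Berlin right now?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model elects to call a tool, its arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```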