
Nemotron 3 Ultra is the flagship large language model in NVIDIA's Nemotron 3 family, announced on December 15, 2025, with general availability expected in the first half of 2026. It has roughly 500 billion total parameters, of which approximately 50 billion are active per token, and uses a Hybrid Mamba-Transformer Latent Mixture-of-Experts (MoE) architecture that combines Mamba-2 state-space layers with Transformer attention layers and Latent MoE routing, a design aimed at high efficiency and strong performance on complex, long-horizon tasks. With a 1 million token context window, Nemotron 3 Ultra is optimized for demanding enterprise AI applications such as deep research, strategic planning, and large-scale multi-agent coordination. It was trained in NVFP4 precision on NVIDIA Blackwell GPUs over a 3-trillion-token dataset, with reinforcement learning applied for agentic tasks. The model supports advanced reasoning and tool use, is multilingual, and has open weights released under the NVIDIA Open Model License.
Type: Text
Parameters: ~500B (total), ~50B (active)
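
To make the architecture description above concrete, the following is a minimal, illustrative sketch of a hybrid state-space/attention stack with routed mixture-of-experts feed-forward layers. It is not NVIDIA's implementation: a toy diagonal SSM stands in for Mamba-2, ordinary top-k routing stands in for the Latent MoE routing, and all module names, dimensions, layer counts, and the interleaving pattern are assumptions chosen for illustration.

```python
# Hedged sketch of a hybrid SSM/attention + MoE stack. Everything here
# (ToySSMBlock, TopKMoE, HybridBlock, the 2:1 interleaving) is a toy
# stand-in, not Nemotron 3 Ultra's actual architecture or code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToySSMBlock(nn.Module):
    """Toy diagonal linear state-space layer (simplified stand-in for Mamba-2)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.decay_logit = nn.Parameter(torch.zeros(d_model))  # per-channel decay
        self.b = nn.Parameter(torch.ones(d_model))
        self.c = nn.Parameter(torch.ones(d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, time, dim)
        a = torch.sigmoid(self.decay_logit)               # keep decay in (0, 1)
        h = torch.zeros_like(x[:, 0])
        outs = []
        for t in range(x.size(1)):                        # sequential recurrence
            h = a * h + self.b * x[:, t]                  # h_t = a*h_{t-1} + b*x_t
            outs.append(self.c * h)                       # y_t = c*h_t
        return torch.stack(outs, dim=1)


class TopKMoE(nn.Module):
    """Routed feed-forward experts: only k of n experts run per token."""
    def __init__(self, d_model: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, time, dim)
        logits = self.router(x)                           # (batch, time, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)        # choose k experts/token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out


class HybridBlock(nn.Module):
    """One layer: SSM or attention token mixer, then a routed MoE feed-forward."""
    def __init__(self, d_model: int, use_attention: bool):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.use_attention = use_attention
        if use_attention:
            self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        else:
            self.ssm = ToySSMBlock(d_model)
        self.moe = TopKMoE(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        if self.use_attention:
            h, _ = self.attn(h, h, h, need_weights=False)
        else:
            h = self.ssm(h)
        x = x + h                                         # residual around the mixer
        return x + self.moe(self.norm2(x))                # residual around the MoE


# Assumed interleaving: mostly SSM layers with a periodic attention layer.
model = nn.Sequential(*[HybridBlock(64, use_attention=(i % 3 == 2)) for i in range(6)])
tokens = torch.randn(2, 16, 64)                           # (batch, seq, d_model)
print(model(tokens).shape)                                # torch.Size([2, 16, 64])
```

The efficiency property the sketch illustrates is that the router activates only k of the n experts for each token, so per-token compute scales with the active parameter count (here, the ~50B figure) rather than the total parameter count (~500B).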