Nemotron 3 Nano 30B
NVIDIA's efficient hybrid Mamba-Transformer MoE for agentic reasoning, with 4x faster throughput than Nemotron 2 Nano.
Specifications
Architecture: TEXT
Parameters: 31.6B
Family: nemotron
VRAM (Q4): 15.8 GB
MoE: 3.2B of the 31.6B parameters are active per token.
Hybrid MoE with Mamba-2 layers. Supports a 1M-token context window.
Tags: nvidia, moe, agentic, mamba, efficient, trending
Quantization Estimates
| Format | Estimated VRAM | Tier |
|---|---|---|
| FP16 | 63.2 GB | Full Precision |
| Q8_0 | 31.6 GB | High |
| Q6_K | 26.9 GB | Excellent |
| Q5_K_M | 22.1 GB | Great |
| Q4_K_M | 15.8 GB | Sweet Spot |
| Q2_K | 9.5 GB | Emergency |
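
The figures in the table appear to follow a simple rule: parameter count multiplied by an approximate number of bytes per parameter. The sketch below reproduces them under that assumption. The multipliers in `BYTES_PER_PARAM` are back-derived from the table itself (table value divided by 31.6B parameters), and the function name `estimate_vram_gb` is hypothetical. This is a weights-only estimate: it does not account for the KV cache, activations, or runtime overhead, so real usage will likely be higher.

```python
# Bytes-per-parameter multipliers back-derived from the table above
# (table VRAM / 31.6B parameters). They are assumptions for illustration,
# not official figures for any quantization format.
BYTES_PER_PARAM = {
    "FP16": 2.0,
    "Q8_0": 1.0,
    "Q6_K": 0.85,
    "Q5_K_M": 0.70,
    "Q4_K_M": 0.50,
    "Q2_K": 0.30,
}


def estimate_vram_gb(params_billions: float, fmt: str) -> float:
    """Weights-only estimate in GB: parameters (billions) x bytes per parameter."""
    return round(params_billions * BYTES_PER_PARAM[fmt], 1)


if __name__ == "__main__":
    # Print one estimate per format for the 31.6B-parameter model.
    for fmt in BYTES_PER_PARAM:
        print(f"{fmt:8s} {estimate_vram_gb(31.6, fmt):5.1f} GB")
```

Because this is a MoE model, all 31.6B parameters still have to be resident in memory even though only about 3.2B are active per token, which is why the estimate uses the total parameter count rather than the active count.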