
Nemotron 3 Nano 30B


NVIDIA's efficient hybrid Mamba-Transformer MoE model for agentic reasoning, with 4x higher throughput than Nemotron 2 Nano.

Specifications

Architecture: TEXT
Parameters: 31.6B
Family: nemotron
VRAM (Q4): 15.8 GB
MoE: 3.2B active.
Hybrid MoE with Mamba-2. Supports 1M context.
Tags: nvidia, moe, agentic, mamba, efficient, trending

Quantization Estimates

Format    VRAM Need   Tier
FP16      63.2 GB     Full Precision
Q8_0      31.6 GB     High
Q6_K      26.9 GB     Excellent
Q5_K_M    22.1 GB     Great
Q4_K_M    15.8 GB     Sweet Spot
Q2_K      9.5 GB      Emergency
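
The estimates above look like the parameter count (31.6B) multiplied by an effective bytes-per-weight factor for each format. The sketch below reproduces the table under that assumption. The factors are back-computed from the listed figures (VRAM divided by 31.6), not taken from any official source, and the function name `estimate_vram_gb` is hypothetical. The result covers model weights only, so context (KV cache) and runtime overhead would come on top.

```python
# Effective bytes per weight, back-computed from the table above
# (listed VRAM / 31.6B parameters). Assumed values, not official figures.
BYTES_PER_WEIGHT = {
    "FP16": 2.0,
    "Q8_0": 1.0,
    "Q6_K": 0.85,
    "Q5_K_M": 0.70,
    "Q4_K_M": 0.50,
    "Q2_K": 0.30,
}


def estimate_vram_gb(params_billion: float, fmt: str) -> float:
    """Weights-only estimate in GB (1 GB = 1e9 bytes), rounded to one decimal."""
    return round(params_billion * BYTES_PER_WEIGHT[fmt], 1)


if __name__ == "__main__":
    # Print one row per format for the 31.6B-parameter model.
    for fmt in BYTES_PER_WEIGHT:
        print(f"{fmt:8s} {estimate_vram_gb(31.6, fmt):5.1f} GB")
```

Swapping in a different parameter count gives a rough first estimate for other models, provided the same factors are a reasonable fit for them.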
