
Nemotron 3 Nano 30B


NVIDIA's efficient hybrid Mamba-Transformer MoE for agentic reasoning, delivering 4x higher throughput than Nemotron 2 Nano.

Model Specifications

Architecture: Text
Parameters: 31.6B
Family: nemotron
VRAM (Q4): 15.8 GB
Mixture of Experts: 3.2B active inference parameters. Hybrid MoE with Mamba-2; supports 1M context.
Tags: #nvidia, #moe, #agentic, #mamba, #efficient, #trending
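
The total/active split is what drives the throughput claim: each token is routed through only a subset of experts, so per-token compute scales with the 3.2B active parameters rather than the 31.6B total. A back-of-the-envelope sketch, assuming the common ~2N FLOPs-per-token approximation for weight-dominated layers (the real speedup also depends on the Mamba-2 layers, routing overhead, and memory bandwidth):

    # Rough per-token compute for MoE vs. an equally sized dense model.
    # Uses the standard ~2 * N_params FLOPs-per-token approximation.
    TOTAL_PARAMS = 31.6e9   # all experts combined (from the spec card)
    ACTIVE_PARAMS = 3.2e9   # parameters routed per token (from the spec card)

    dense_flops = 2 * TOTAL_PARAMS    # hypothetical dense model of the same size
    moe_flops = 2 * ACTIVE_PARAMS     # only the routed experts run per token

    print(f"per-token compute reduction: {dense_flops / moe_flops:.1f}x")
    # -> per-token compute reduction: 9.9x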

Estimated Quantization Sizes

Format        Precision   Est. VRAM   Recommendation
FP16 / BF16   16-bit      63.2 GB     Uncompressed Base
Q8_0          8-bit       31.6 GB     Near Lossless
Q6_K          6-bit       23.7 GB     Excellent Balance
Q4_K_M        4-bit       15.8 GB     Standard Use
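
These figures are weight-only estimates: parameter count times bits per weight, with no allowance for KV cache, Mamba state, or runtime buffers, which add several GB in practice. A minimal sketch of the arithmetic, taking the nominal bit-widths from the Precision column (actual K-quants use slightly more bits per weight than their nominal figure):

    # Weight-only VRAM estimate: params * bits_per_weight / 8 bytes.
    # Reproduces the table above; real deployments need extra headroom
    # for KV cache, Mamba-2 state, and runtime buffers.
    PARAMS = 31.6e9  # total parameters (from the spec card)

    quants = {
        "FP16 / BF16": 16,
        "Q8_0": 8,
        "Q6_K": 6,     # nominal; actual K-quant bpw is slightly higher
        "Q4_K_M": 4,   # nominal; actual is ~4.8 bpw in practice
    }

    for name, bits in quants.items():
        vram_gb = PARAMS * bits / 8 / 1e9
        print(f"{name:>12}: {vram_gb:5.1f} GB")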
