Nemotron 3 Super 120B

NVIDIA's hybrid Mamba-Transformer MoE with 1M context, optimized for collaborative agents and high-volume enterprise workflows

Specifications

Architecture: TEXT
Parameters: 120B
Family: nemotron
VRAM (Q4): 60.0 GB
MoE: 12B active.
Hybrid LatentMoE with Mamba-2 and MTP layers. Supports 1M context.
Tags: nvidia, moe, agentic, mamba, trending

Quantization Estimates

Format    VRAM Need   Tier
FP16      240.0 GB    Full Precision
Q8_0      120.0 GB    High
Q6_K      102.0 GB    Excellent
Q5_K_M     84.0 GB    Great
Q4_K_M     60.0 GB    Sweet Spot
Q2_K       36.0 GB    Emergency
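
The figures above are consistent with a simple weights-only rule: parameters times effective bits per weight, divided by 8. The sketch below reproduces them under that assumption. The bits-per-weight values are back-derived from this table (for example, 102.0 GB / 120B params = 6.8 bits for Q6_K) and are not official figures for any runtime. The function names are hypothetical, and the estimate ignores KV cache, activations, and runtime overhead, so real usage will likely be higher.

```python
# Weights-only VRAM estimate: params (billions) * bits per weight / 8 -> GB.
# ASSUMPTION: bits-per-weight values are inferred from the table above,
# not taken from any official quantization spec.
PARAMS_BILLIONS = 120.0

BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.0,
    "Q6_K": 6.8,
    "Q5_K_M": 5.6,
    "Q4_K_M": 4.0,
    "Q2_K": 2.4,
}


def estimate_vram_gb(params_billions: float, bits_per_weight: float) -> float:
    """Return the weights-only estimate in GB (1 GB = 1e9 bytes)."""
    return round(params_billions * bits_per_weight / 8.0, 1)


def largest_format_that_fits(vram_gb: float,
                             params_billions: float = PARAMS_BILLIONS):
    """Return the highest-precision format whose estimate fits, else None."""
    by_precision = sorted(BITS_PER_WEIGHT.items(), key=lambda kv: -kv[1])
    for name, bits in by_precision:
        if estimate_vram_gb(params_billions, bits) <= vram_gb:
            return name
    return None


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name:8s} {estimate_vram_gb(PARAMS_BILLIONS, bits):6.1f} GB")
```

Note that the estimate uses the full 120B parameters rather than the 12B active ones: in a MoE model only a subset of experts runs per token, but all expert weights generally need to be resident in memory unless the runtime offloads them.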