
Nemotron 3 Super 120B


NVIDIA's hybrid Mamba-Transformer MoE with 1M context, optimized for collaborative agents and high-volume enterprise workflows

Model Specifications

Architecture: Text
Parameters: 120B
Family: Nemotron
VRAM (Q4): 60.0 GB
Mixture of Experts: 12B active inference parameters
Hybrid latent MoE with Mamba-2 and MTP layers; supports 1M context.
Tags: #nvidia #moe #agentic #mamba #trending
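
The 120B/12B split is the usual MoE memory-versus-compute trade-off: every expert's weights must stay resident, but each token only exercises the active subset. A minimal sketch of that arithmetic in Python, assuming 2 bytes per parameter at BF16 and the common rough rule of ~2 FLOPs per active parameter per token (both are illustrative assumptions, not figures from this page):

# MoE sizing sketch: weight memory scales with TOTAL parameters,
# per-token compute scales with ACTIVE parameters.
TOTAL_PARAMS = 120e9    # full expert pool, resident in memory
ACTIVE_PARAMS = 12e9    # parameters exercised per token at inference

BYTES_PER_PARAM_BF16 = 2  # assumption: uncompressed BF16 weights
weight_memory_gb = TOTAL_PARAMS * BYTES_PER_PARAM_BF16 / 1e9  # ~240 GB

# Rough rule of thumb: ~2 FLOPs per active parameter per generated token.
flops_per_token = 2 * ACTIVE_PARAMS

print(f"Resident weights (BF16): {weight_memory_gb:.0f} GB")
print(f"Approx. FLOPs per token: {flops_per_token:.2e}")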

Estimated Quantization Sizes

Format         Precision   Est. VRAM   Recommendation
FP16 / BF16    16-bit      240.0 GB    Uncompressed Base
Q8_0           8-bit       120.0 GB    Near Lossless
Q6_K           6-bit       90.0 GB     Excellent Balance
Q4_K_M         4-bit       60.0 GB     Standard Use
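
These estimates follow directly from the parameter count and each format's bits per weight (decimal gigabytes, weights only; KV cache and runtime overhead are excluded, as in the table above). A minimal sketch of the calculation in Python:

# Estimated weight size = parameters * bits-per-weight / 8 bits-per-byte.
# Reproduces the table above; excludes KV cache, activations, and overhead.
PARAMS = 120e9  # total parameter count

QUANT_BITS = {
    "FP16 / BF16": 16,
    "Q8_0": 8,
    "Q6_K": 6,
    "Q4_K_M": 4,
}

for fmt, bits in QUANT_BITS.items():
    size_gb = PARAMS * bits / 8 / 1e9  # decimal GB
    print(f"{fmt:12} {size_gb:6.1f} GB")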
