
Llama 4 Maverick

High-efficiency MoE, 128 experts, 1M context

Specifications

Architecture: TEXT
Parameters: 400B
Family: llama4
VRAM (Q4): 200.0 GB
MoE: 17B active.
Outperforms GPT-4.5, optimized for DGX systems
Tags: chat, meta, open-weights, multimodal

Run in the Cloud

This model requires enterprise-grade VRAM. Rent GPUs on RunPod and start generating.
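
As a rough illustration of what "enterprise-grade" means here, the sketch below counts how many GPUs are needed just to hold the weights. The 80 GB per-GPU figure is an assumption (a common data-center card size), the helper name gpus_needed is hypothetical, and the calculation ignores KV cache, activations, and framework overhead, so real deployments will likely need more headroom.

```python
import math


def gpus_needed(model_vram_gb: float, gpu_vram_gb: float = 80.0) -> int:
    """Minimum number of GPUs whose combined memory covers the weights.

    gpu_vram_gb defaults to 80 GB, which is an assumption, not a value
    taken from this page. Weights only: no KV cache or runtime overhead.
    """
    return math.ceil(model_vram_gb / gpu_vram_gb)


if __name__ == "__main__":
    # VRAM figures below are the Q4 and FP16 estimates listed on this page.
    print("Q4   (200 GB):", gpus_needed(200.0), "GPUs")
    print("FP16 (800 GB):", gpus_needed(800.0), "GPUs")
```

Under the same assumptions, a single 24 GB consumer card falls well short of even the smallest quantization listed on this page.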

Quantization Estimates

Format    VRAM Need   Tier
FP16      800.0 GB    Full Precision
Q8_0      400.0 GB    High
Q6_K      340.0 GB    Excellent
Q5_K_M    280.0 GB    Great
Q4_K_M    200.0 GB    Sweet Spot
Q2_K      120.0 GB    Emergency
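
The estimates above are consistent with a simple rule: total parameters multiplied by a per-format bytes-per-parameter factor. The sketch below reproduces the table under that reading. The factors are inferred by dividing each listed figure by 400B and are an assumption, not the calculator's published formula; the names BYTES_PER_PARAM and estimate_vram_gb are hypothetical. All 400B parameters are counted, since every expert must be resident in memory even though only 17B are active per token.

```python
# Total parameter count in billions, as listed on this page.
PARAMS_BILLION = 400.0

# Bytes per parameter, inferred from the table (listed GB / 400B).
# These are back-calculated assumptions, not official format sizes.
BYTES_PER_PARAM = {
    "FP16": 2.00,
    "Q8_0": 1.00,
    "Q6_K": 0.85,
    "Q5_K_M": 0.70,
    "Q4_K_M": 0.50,
    "Q2_K": 0.30,
}


def estimate_vram_gb(params_billion: float, fmt: str) -> float:
    """Weights-only estimate: 1B parameters at 1 byte each is 1 GB."""
    return round(params_billion * BYTES_PER_PARAM[fmt], 1)


if __name__ == "__main__":
    for fmt in BYTES_PER_PARAM:
        print(f"{fmt:<8}{estimate_vram_gb(PARAMS_BILLION, fmt):>7.1f} GB")
```

Because the rule covers weights only, actual usage will be higher once context length and batch size add KV-cache memory.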
