Back to Calculator Deploy on RunPodDeploy Now
Llama 4 Maverick
High-efficiency MoE, 128 experts, 1M context
Specifications
SourceArchitectureTEXT
Parameters400B
Familyllama4
VRAM (Q4)200.0G
MoE: 17B active.
Outperforms GPT-4.5, optimized for DGX systems
chatmetaopen-weightsmultimodal
Run in the Cloud
This model requires enterprise-grade VRAM. Rent GPUs on RunPod and start generating.
Instant Cloud GPUs
Running out of VRAM? Rent a high-end H100 or RTX 4090 on RunPod and deploy in seconds.
Quantization Estimates
| Format | VRAM Need | Tier |
|---|---|---|
| FP16 | 800.0 GB | Full Precision |
| Q8_0 | 400.0 GB | High |
| Q6_K | 340.0 GB | Excellent |
| Q5_K_M | 280.0 GB | Great |
| Q4_K_M | 200.0 GB | Sweet Spot |
| Q2_K | 120.0 GB | Emergency |
Share this Model
Send these specs directly to your community.