Magistral Small
Mistral AI's first open-weight reasoning model: 24B parameters under Apache 2.0. Chain-of-thought reasoning in 20+ languages, 70.7% on AIME 2024. Fits on a single RTX 4090 or a 32GB MacBook.
Model Specifications
Architecture: Text
Parameters: 24B
Family: magistral
VRAM (Q4): 12.0 GB
128K context window, though performance may degrade past roughly 40K tokens. Excellent for domain-specific reasoning tasks.
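For running the model locally, here is a minimal sketch using llama-cpp-python with a Q4_K_M GGUF build, capping the context under the ~40K-token degradation point noted above. The filename is hypothetical; point it at whichever quantized build you actually downloaded.

```python
# Minimal sketch: load a quantized Magistral Small build with llama-cpp-python.
# The GGUF filename is hypothetical; substitute your downloaded file.
from llama_cpp import Llama

llm = Llama(
    model_path="./magistral-small-q4_k_m.gguf",  # hypothetical local path
    n_ctx=40_000,      # stay under the ~40K-token degradation point
    n_gpu_layers=-1,   # offload all layers to the GPU (~12 GB at Q4)
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Prove that the sum of two even numbers is even."}],
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
```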
Estimated Quantization Sizes
| Format | Precision | Est. VRAM | Recommendation |
|---|---|---|---|
| FP16 / BF16 | 16-bit | 48.0 GB | Uncompressed Base |
| Q8_0 | 8-bit | 24.0 GB | Near Lossless |
| Q6_K | 6-bit | 18.0 GB | Excellent Balance |
| Q4_K_M | 4-bit | 12.0 GB | Standard Use |
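These estimates are straight bits-per-weight arithmetic: parameter count times bytes per weight. A quick sketch to reproduce the table (nominal bits, so real K-quant files land slightly higher, and KV cache and activation overhead is extra; treat these as floor values):

```python
# Back-of-envelope VRAM estimate: parameters * bits_per_weight / 8 bytes.
# Uses nominal bits; actual K-quants mix precisions, so files run slightly larger.
PARAMS = 24e9  # Magistral Small, 24B parameters

def est_vram_gb(bits_per_weight: float) -> float:
    return PARAMS * bits_per_weight / 8 / 1e9

for fmt, bits in [("FP16/BF16", 16), ("Q8_0", 8), ("Q6_K", 6), ("Q4_K_M", 4)]:
    print(f"{fmt:10s} ~{est_vram_gb(bits):.1f} GB")
# FP16/BF16  ~48.0 GB
# Q8_0       ~24.0 GB
# Q6_K       ~18.0 GB
# Q4_K_M     ~12.0 GB
```

This is also why the single-RTX-4090 claim holds at Q4: ~12 GB of weights leaves headroom for context in the card's 24 GB.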
Similar Models
Magistral Medium
Enterprise-tier version of the Magistral reasoning family. More capable than Magistral Small for complex multi-step tasks in math, law, finance, and code. Available via the Mistral API and Le Chat; not open-weight.
Llama 3.3 70B
70.55B parameters. Refined Llama 3 with superior instruction following.
Llama 3.2 3B
3.21B parameters. Mobile-optimized small model.
Related Guides
How much VRAM do you really need?
A complete breakdown of quantization levels and VRAM overhead for running local models.
Best GPUs for Machine Learning in 2026
Comparing NVIDIA and AMD options for the best speed-to-dollar ratio.
GGUF vs EXL2 vs AWQ
Understanding local AI formats and which one to pick for your specific hardware.