Magistral Small
Mistral AI's first open-weight reasoning model: 24B parameters under Apache 2.0. Chain-of-thought reasoning in 20+ languages, 70.7% on AIME 2024. Fits on a single RTX 4090 or 32GB MacBook when quantized (for example, Q4_K_M at about 12 GB).
Specifications
| Field | Value |
|---|---|
| Source | |
| Architecture | TEXT |
| Parameters | 24B |
| Family | magistral |
| VRAM (Q4) | 12.0 GB |
The context window is 128K tokens, but performance may degrade past 40K tokens. Well suited to domain-specific reasoning tasks.
Tags: mistral, reasoning, apache2, multilingual, math, trending
Quantization Estimates
| Format | VRAM Need | Tier |
|---|---|---|
| FP16 | 48.0 GB | Full Precision |
| Q8_0 | 24.0 GB | High |
| Q6_K | 20.4 GB | Excellent |
| Q5_K_M | 16.8 GB | Great |
| Q4_K_M | 12.0 GB | Sweet Spot |
| Q2_K | 7.2 GB | Emergency |
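The figures in the table are consistent with a simple weights-only estimate: parameter count (in billions) multiplied by a bytes-per-weight factor for each format. The sketch below reproduces the table under that assumption. The factors in `BYTES_PER_WEIGHT` are inferred from the table itself, not taken from any published formula, and the function names are hypothetical. Real quantized file sizes differ somewhat, and the estimate ignores KV cache and runtime overhead.

```python
from __future__ import annotations

# Bytes-per-weight factors inferred from the table above.
# Assumption: these are back-calculated from the listed GB figures for a
# 24B model, not official values for any quantization format.
BYTES_PER_WEIGHT = {
    "FP16": 2.0,
    "Q8_0": 1.0,
    "Q6_K": 0.85,
    "Q5_K_M": 0.70,
    "Q4_K_M": 0.50,
    "Q2_K": 0.30,
}


def estimate_vram_gb(params_billion: float, fmt: str) -> float:
    """Weights-only VRAM estimate in GB, rounded to one decimal place."""
    return round(params_billion * BYTES_PER_WEIGHT[fmt], 1)


def formats_that_fit(params_billion: float, budget_gb: float) -> list[str]:
    """Formats whose weights-only estimate does not exceed the VRAM budget."""
    return [
        fmt
        for fmt in BYTES_PER_WEIGHT
        if estimate_vram_gb(params_billion, fmt) <= budget_gb
    ]


if __name__ == "__main__":
    for fmt in BYTES_PER_WEIGHT:
        print(f"{fmt:8s} {estimate_vram_gb(24, fmt):5.1f} GB")
    # Which formats fit a 24 GB card such as an RTX 4090 (weights only)?
    print(formats_that_fit(24, 24.0))
```

On a 24 GB budget, Q8_0 matches the limit exactly under this estimate, leaving nothing for context, so a smaller format is the more realistic choice on such a card.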
Similar Models
| Model | Parameters | Description |
|---|---|---|
| Magistral Medium | 0B (as listed) | Enterprise-tier version of the Magistral reasoning family. More capable than Magistral Small for complex multi-step tasks in math, law, finance, and code. Available via the Mistral API and Le Chat; not open-weight. |
| Llama 3.1 8B | 8.03B | Best small model for most tasks |
| Qwen 2.5 72B | 72.7B | Top-tier reasoning and coding |