LocalOps

GPT OSS 20B

OpenAI's compact open-weight mixture-of-experts (MoE) reasoning model. It scores close to o3-mini on common benchmarks and runs in 16 GB of VRAM. Part of OpenAI's first open-weight language model release since GPT-2.

Specifications

Architecture: TEXT
Parameters: 21B
Family: gpt-oss
VRAM (Q4): 16 GB
MoE: 3.6B active.
Apache 2.0. MXFP4 quantized. Configurable reasoning effort (low/medium/high). 128K context. Runs on 16GB GPU.
Tags: openai, reasoning, moe, efficient, apache2, trending

Quantization Estimates

Format | VRAM Need | Tier
FP16 | 42.0 GB | Full Precision
Q8_0 | 21.0 GB | High
Q6_K | 17.8 GB | Excellent
Q5_K_M | 14.7 GB | Great
Q4_K_M | 10.5 GB | Sweet Spot
Q2_K | 6.3 GB | Emergency
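
The estimates above appear to follow a simple rule: parameter count times an approximate number of bytes per parameter for each format. The sketch below reproduces that arithmetic for the 21B figure on this page. The per-format factors are back-computed from the table (for example, 42.0 GB / 21B = 2.0 bytes for FP16) and are assumptions, not published values. The function names are hypothetical, and the estimate covers weights only, with no allowance for KV cache, activations, or runtime overhead.

```python
from typing import Optional

# Approximate bytes per parameter, back-computed from the table above.
# Assumed values for illustration, listed from largest to smallest format.
BYTES_PER_PARAM = {
    "FP16": 2.0,
    "Q8_0": 1.0,
    "Q6_K": 0.85,
    "Q5_K_M": 0.70,
    "Q4_K_M": 0.50,
    "Q2_K": 0.30,
}


def estimate_vram_gb(params_billion: float, fmt: str) -> float:
    """Weights-only VRAM estimate in GB for a model of the given size."""
    return params_billion * BYTES_PER_PARAM[fmt]


def largest_format_that_fits(params_billion: float, vram_gb: float) -> Optional[str]:
    """Return the highest-precision format whose estimate fits the budget."""
    for fmt in BYTES_PER_PARAM:  # dicts keep insertion order: largest first
        if estimate_vram_gb(params_billion, fmt) <= vram_gb:
            return fmt
    return None  # nothing fits, not even the smallest format


if __name__ == "__main__":
    for fmt in BYTES_PER_PARAM:
        print(f"{fmt:8s} {estimate_vram_gb(21, fmt):5.1f} GB")
    print("Best fit for 16 GB:", largest_format_that_fits(21, 16))
```

Because the estimate ignores context-length memory, leave headroom when picking a format: a result that fits the budget by a few hundred megabytes may still run out of memory at long context lengths.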