LocalOps

GPT OSS 20B


OpenAI's compact open-weight MoE reasoning model: it matches o3-mini on benchmarks and runs in 16 GB of VRAM. This is the first major open-weight release from OpenAI.

Model Specifications

Architecture: Text
Parameters: 21B
Family: gpt-oss
VRAM (Q4): 16 GB
Mixture of Experts: 3.6B active inference parameters
Apache 2.0 licensed. MXFP4 quantized. Configurable reasoning effort (low/medium/high). 128K context window. Runs on a 16 GB GPU.
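Because this is a mixture-of-experts model, only a fraction of the weights participate in each forward pass; the rest sit idle in memory. A quick back-of-envelope check using the figures above (variable names are my own):

```python
# Figures from the spec list above.
total_params = 21e9    # total parameters (21B)
active_params = 3.6e9  # parameters active per token via MoE routing

active_fraction = active_params / total_params
print(f"Active per token: {active_fraction:.0%}")  # roughly 17%
```

Note that all 21B weights must still fit in VRAM; MoE routing reduces per-token compute, not the memory footprint.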
#openai#reasoning#moe#efficient#apache2#trendingSource

Estimated Quantization Sizes

Format       Precision  Est. VRAM  Recommendation
FP16 / BF16  16-bit     42.0 GB    Uncompressed Base
Q8_0         8-bit      21.0 GB    Near Lossless
Q6_K         6-bit      15.8 GB    Excellent Balance
Q4_K_M       4-bit      10.5 GB    Standard Use
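The table's estimates follow directly from parameter count times bits per weight: 21B parameters at 16 bits is 42 GB, and each halving of precision halves the weight memory. A minimal sketch (the function name is my own; real usage needs extra VRAM for the KV cache and activations on top of these weight-only figures):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weight-only memory estimate: params * bits / 8 bytes, in decimal GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Reproduce the table rows for a 21B-parameter model.
for fmt, bits in [("FP16", 16), ("Q8_0", 8), ("Q6_K", 6), ("Q4_K_M", 4)]:
    print(f"{fmt}: {estimate_vram_gb(21, bits):.1f} GB")
```

The Q6_K row (15.8 GB) shows why that quantization is the practical ceiling for a 16 GB GPU once runtime overhead is added.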
