GLM-5.1

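Z.ai's next-generation flagship model for agentic engineering: a 744B-parameter Mixture-of-Experts (MoE) model with 40B active parameters, released under the MIT license. Ranked #1 among open-weight models on SWE-Bench Pro as of April 2026. Trained on Huawei Ascend chips.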

Model Specifications

Architecture: TEXT

Parameters: 744B

Family: glm

VRAM (Q4): 372.0 GB

Mixture of Experts: active inference parameters: 40B

Tags: #coding #agents #reasoning

Estimated Quantization Sizes

Format	Precision	Est. VRAM	Recommendation
FP16 / BF16	16-bit	1488.0 GB	Uncompressed Base
Q8_0 (High)	8-bit	744.0 GB	Near Lossless
Q6_K	6-bit	558.0 GB	Excellent Balance
Q4_K_M (Popular)	4-bit	372.0 GB	Standard Use
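
The figures in the table are consistent with a simple weights-only rule: total parameters times bits per weight, divided by 8. The sketch below reproduces them under that assumption. The names PARAMS_BILLIONS, NOMINAL_BITS, and estimate_gb are illustrative, not taken from the source, and the nominal bit widths are an assumption: real GGUF K-quants store somewhat more than their nominal bits per weight, and the KV cache and activations need additional memory on top of the weights.

```python
# Assumption: estimated size = parameter count x bits per weight / 8.
# This covers weights only; KV cache, activations, and per-block
# quantization metadata are ignored.
PARAMS_BILLIONS = 744  # total parameters, from the spec card

# Nominal bits per weight implied by the table (an assumption; actual
# quantized files typically use slightly more bits per weight).
NOMINAL_BITS = {
    "FP16 / BF16": 16,
    "Q8_0": 8,
    "Q6_K": 6,
    "Q4_K_M": 4,
}


def estimate_gb(params_billions: float, bits: float) -> float:
    """Weights-only size in GB, taking 1 GB as 1e9 bytes."""
    return params_billions * bits / 8


estimates = {
    name: estimate_gb(PARAMS_BILLIONS, bits)
    for name, bits in NOMINAL_BITS.items()
}

for name, gb in estimates.items():
    print(f"{name:<12} {gb:7.1f} GB")
```

Because this is a Mixture-of-Experts model, only about 40B parameters are active per token, but the estimate uses all 744B: every expert's weights normally have to be resident (or offloaded) for inference, so the active count affects compute more than memory.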


Similar Models

GLM-4 9B (9.3B): Tsinghua bilingual model

GLM-4.6 (355B): Latest Zhipu flagship MoE model

GLM-4.5 (355B): Advanced open-source MoE from Zhipu
