Qwen3.6 35B-A3B
First open-weight Qwen3.6 model. 35B total / 3B active MoE, focused on agentic coding and repo-level reasoning. Native 262K context, extensible to 1M tokens. Apache 2.0.
Specifications
| Spec | Value |
|---|---|
| Architecture | TEXT |
| Parameters | 35B |
| Family | qwen3.6 |
| VRAM (Q4) | 17.5 GB |

MoE: 3B active.
coding, agents, thinking
Quantization Estimates
| Format | VRAM Needed | Tier |
|---|---|---|
| FP16 | 70.0 GB | Full Precision |
| Q8_0 | 35.0 GB | High |
| Q6_K | 29.8 GB | Excellent |
| Q5_K_M | 24.5 GB | Great |
| Q4_K_M | 17.5 GB | Sweet Spot |
| Q2_K | 10.5 GB | Emergency |
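
The figures in this table appear consistent with a weights-only estimate: total parameters multiplied by bytes per weight. A minimal Python sketch of that arithmetic follows. The bits-per-weight values are back-computed from the table (VRAM in GB ÷ 35B parameters × 8), so treat them as assumptions rather than official GGUF figures, and the function names are hypothetical. The estimate uses all 35B parameters, not the 3B active ones, which matches the table's numbers. It does not account for KV cache or runtime overhead.

```python
# Bits per weight implied by the table (assumption: back-computed as
# VRAM_GB / 35 * 8; these are not official GGUF bits-per-weight figures).
IMPLIED_BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.0,
    "Q6_K": 6.8,
    "Q5_K_M": 5.6,
    "Q4_K_M": 4.0,
    "Q2_K": 2.4,
}


def estimate_weight_vram_gb(total_params_billions: float, bits_per_weight: float) -> float:
    """Weights-only memory estimate in GB (1 GB = 1e9 bytes).

    Uses the total parameter count: for an MoE model every expert's
    weights are counted, even though only a subset is active per token.
    """
    return total_params_billions * bits_per_weight / 8.0


# Print an estimate for each format for a 35B-parameter model.
for fmt, bits in IMPLIED_BITS_PER_WEIGHT.items():
    print(f"{fmt:7s} {estimate_weight_vram_gb(35.0, bits):5.1f} GB")
```

In practice, leave extra headroom beyond these numbers for the context window, since long prompts add memory use that this estimate ignores.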
Similar Models
Qwen3.6 Plus
0B: Alibaba's next-gen hybrid-architecture flagship, released as a free preview on OpenRouter (March 31 2026). Always-on chain-of-thought, 1M token context, up to 65K output tokens; built for agentic coding and long-document workflows.
Llama 3.1 8B
8.03B: Best small model for most tasks
Qwen 2.5 72B
72.7B: Top-tier reasoning and coding