Qwen3.5-Omni Plus
Alibaba's flagship omni-modal model, processing text, images, audio, and video natively. It uses a Thinker-Talker MoE architecture with real-time streaming speech output, a 256K context window, and speech recognition in 113 languages.
Model Specifications
Architecture: AUDIO
Parameters: 30B
Family: qwen3.5
VRAM (Q4): 15.0 GB
Mixture of Experts: 3B active inference parameters
Plus (30B-A3B) and Flash variants are API-only via DashScope as of March 31, 2026. Weights are not yet confirmed publicly.
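The Q4 figure in the specifications is consistent with a simple weights-only estimate: parameter count times bits per weight, divided by eight. The sketch below reproduces that arithmetic. The helper name `weight_vram_gb` is hypothetical, and the formula is an assumption about how the listed number was derived; it ignores KV cache, activations, and runtime overhead, and real 4-bit quantization formats typically store somewhat more than 4 bits per weight.

```python
def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough memory for the weights alone, in decimal GB.

    Assumption: every parameter is stored at the same bit width and
    nothing else (KV cache, activations, framework overhead) is counted.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


TOTAL_PARAMS_B = 30   # total parameters listed on this page
ACTIVE_PARAMS_B = 3   # active inference parameters listed on this page

# 4 bits per weight is the idealized Q4 case.
q4_gb = weight_vram_gb(TOTAL_PARAMS_B, 4)
print(f"Q4 weights: {q4_gb:.1f} GB")

# Share of parameters used per token in the MoE configuration.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active fraction: {active_fraction:.0%}")
```

In a Mixture of Experts model the estimate generally uses the total parameter count rather than the active count, since all experts usually need to be resident in memory even though only a fraction is used for each token.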
Similar Models
Qwen3.5-Omni Light
7B: Alibaba's compact open-weight omni-modal model. Handles text, image, audio, and video in a single inference pass; self-hostable on HuggingFace under Apache 2.0.
Qwen 3 Omni
30B: End-to-end voice/text/vision interaction
Qwen3-TTS VoiceDesign (1.7B)
1.7B: Zero-shot voice design from text descriptions
Related Guides
How much VRAM do you really need?
A complete breakdown of quantization levels and VRAM overhead for running local models.
Best GPUs for Machine Learning in 2026
Comparing NVIDIA and AMD options for the best speed-to-dollar ratio.
GGUF vs EXL2 vs AWQ
Understanding local AI formats and which one to pick for your specific hardware.