LocalOps

Voxtral TTS 4B

Mistral's open-weight text-to-speech model with support for 9 languages, voice cloning from 3 seconds of audio, and 70 ms latency.

Model Specifications

Architecture: AUDIO
Parameters: 4B
Family: voxtral
VRAM (Q4): 8GB
Runs on a single GPU with >=16GB VRAM. Architecture: 3.4B transformer + 390M flow-matching + 300M codec (about 4.1B parameters in total)
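
As a rough sanity check on the component breakdown above, the sketch below sums the three listed parameter counts and estimates how much memory the weights alone would occupy at a few precisions. The dictionary keys and the `weight_bytes` helper are hypothetical names introduced for illustration, and the estimate deliberately ignores activations, caches, and runtime overhead, so it is not expected to reproduce the VRAM figures quoted on this page.

```python
# Component sizes as listed on the page (parameter counts).
COMPONENTS = {
    "transformer": 3_400_000_000,
    "flow_matching": 390_000_000,
    "codec": 300_000_000,
}


def weight_bytes(num_params: int, bits_per_param: int) -> int:
    """Bytes needed to hold the weights alone at a given precision.

    Assumption: every parameter is stored at the same bit width, with no
    per-block scales or other quantization metadata.
    """
    return num_params * bits_per_param // 8


total_params = sum(COMPONENTS.values())

if __name__ == "__main__":
    print(f"total parameters: {total_params / 1e9:.2f}B")
    for label, bits in (("16-bit", 16), ("8-bit", 8), ("4-bit", 4)):
        gb = weight_bytes(total_params, bits) / 1e9  # decimal gigabytes
        print(f"{label}: {gb:.2f} GB of weights")
```

Actual memory use will be higher than these weight-only numbers, since inference also needs room for activations, audio buffers, and framework overhead.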
#tts #mistral #voice-cloning #trending
