Voxtral TTS 4B

Mistral's open-weight text-to-speech model, with support for 9 languages, voice cloning from 3 s of audio, and 70 ms latency.

Specifications

Architecture: AUDIO
Parameters: 4B
Family: voxtral
VRAM (Q4): 8 GB
Runs on a single GPU with >=16 GB VRAM. Architecture: 3.4B transformer + 390M flow-matching + 300M codec (about 4.1B parameters in total).
Tags: tts, mistral, voice-cloning, trending
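
As a rough check on the component sizes listed above, the sketch below sums them and estimates the memory taken by the weights alone at a few storage precisions. The bytes-per-parameter values are assumptions for illustration, not figures from the card: real 4-bit formats store extra scale data, and actual VRAM use also includes activations, caches, and runtime overhead. The results are therefore lower bounds and are not directly comparable to the VRAM figures above.

```python
# Component sizes as listed on the card, in parameters.
COMPONENTS = {
    "transformer": 3.4e9,
    "flow_matching": 390e6,
    "codec": 300e6,
}

# Assumed bytes per parameter, weights only (illustrative values;
# real quantized formats carry some extra metadata per block).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "q4": 0.5}


def total_params(components=COMPONENTS):
    """Sum the parameter counts of all listed components."""
    return sum(components.values())


def weight_gb(n_params, fmt):
    """Weights-only memory in GB (1 GB = 1e9 bytes) for a storage format."""
    return n_params * BYTES_PER_PARAM[fmt] / 1e9


if __name__ == "__main__":
    n = total_params()
    print(f"total parameters: {n / 1e9:.2f}B")
    for fmt in BYTES_PER_PARAM:
        print(f"{fmt:>5}: {weight_gb(n, fmt):.2f} GB of weights")
```

The three components sum to 4.09B parameters, which is consistent with the "4B" in the model name.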
