Llama-3.1 Omni 8B
Low-latency speech interaction
Specifications
Architecture: AUDIO
Parameters: 8B
Family: llama
VRAM (Q4): 4.0 GB
Tags: audio-chat, meta, experimental
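
The Q4 VRAM figure above is consistent with a simple weights-only estimate: 8 billion parameters at 4 bits each come to about 4.0 GB. The sketch below reproduces that arithmetic. It assumes exactly 4 bits per weight and decimal gigabytes (1 GB = 10^9 bytes), and it ignores quantization scale metadata, the KV cache, activations, and any audio encoder or decoder components, so actual usage will likely be higher. The function name `estimate_weight_memory_gb` is hypothetical and is not taken from the calculator.

```python
def estimate_weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Rough weights-only memory estimate in decimal gigabytes.

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no per-block scales, KV cache, or activation memory included.
    """
    total_bytes = num_params * bits_per_weight / 8  # bits -> bytes
    return total_bytes / 1e9                        # bytes -> GB (10^9 bytes)


# 8B parameters at 4 bits per weight (the Q4 case listed in the specs)
q4_gb = estimate_weight_memory_gb(8e9, 4)
print(f"Q4 weights: {q4_gb:.1f} GB")      # Q4 weights: 4.0 GB

# Same model at 16 bits per weight, for comparison
fp16_gb = estimate_weight_memory_gb(8e9, 16)
print(f"FP16 weights: {fp16_gb:.1f} GB")  # FP16 weights: 16.0 GB
```

Treat the result as a lower bound when sizing a GPU: leave headroom beyond the weights-only number for context length and runtime overhead.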