LocalOps

Qwen3 Embedding 8B

Alibaba's flagship embedding model, ranked #1 on the MTEB multilingual leaderboard (score 70.58). It is an 8B-parameter decoder-only transformer with a 32K-token context window and a configurable output size of 32 to 4096 dimensions. It supports 100+ languages and outperforms dedicated encoder models on multilingual retrieval tasks.

Model Specifications

Architecture: EMBEDDING
Parameters: 8B
Family: qwen3-embed
VRAM (Q4): 4.0 GB

Serve with vLLM or TGI for high-throughput inference. Because the model is trained with Matryoshka Representation Learning (MRL), its embeddings can be truncated to 1024 dimensions at roughly 5% quality cost. A strong pick for multilingual RAG.
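
The truncation mentioned above can be sketched in a few lines. This is a minimal illustration, not the model's official API: it assumes you already have a full-size embedding as a list of floats, and the helper name `truncate_embedding` is hypothetical. The usual MRL recipe is to keep the leading components and rescale to unit length so that cosine similarity still behaves as expected.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dims: int = 1024) -> List[float]:
    """Keep the first `dims` components of an embedding and L2-normalize them.

    Hypothetical helper for illustration; `vec` is assumed to come from an
    MRL-trained model, where the leading components carry most of the signal.
    """
    if not 1 <= dims <= len(vec):
        raise ValueError(f"dims must be between 1 and {len(vec)}, got {dims}")
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Stand-in for a 4096-dim model output; real values come from the model.
    full = [math.sin(i) for i in range(4096)]
    short = truncate_embedding(full, 1024)
    print(len(short), round(cosine_similarity(short, short), 6))
```

Truncate query and document vectors to the same size before comparing them, and store only the shortened vectors if index size is the concern.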

#alibaba #qwen #embedding #retrieval #rag #multilingual #apache2 #trending
