voyage-3.5-lite
MongoDB, Inc.
Embedding model for general-purpose (including multilingual) retrieval, search, and AI applications. 32K context length.
Text embedding model optimized for general-purpose retrieval quality, latency, and cost for AI applications. 32K context length.
Throughput varies significantly with workload: GPU type, model size, sequence length, batch size, and vector dimensionality all play a role. On A100 GPUs, this model typically achieves roughly 75k to 150k tokens/sec. We recommend that customers benchmark their own throughput and token volume during testing to inform token TCO (total cost of ownership) estimates.
- Outperforms OpenAI-v3-large and voyage-3-lite by an average of 6.34% and 4.28% respectively across domains
- Supports embeddings of 2048, 1024, 512, and 256 dimensions
- Includes float, int8, uint8, and binary quantization options
- Maintains a 32K-token context length at the same price point as voyage-3-lite
- Achieves quality within 0.3% of Cohere-v4 at 1/6 the cost
- Cuts vector database costs by up to 83% (int8, 2048) or 99% (binary, 1024) compared to OpenAI-v3-large
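The int8 and binary output options listed above can be illustrated with a minimal pure-Python sketch: linear int8 quantization of a float vector, and sign-bit packing for binary. This shows the general technique only, not Voyage's exact quantization scheme, and the helper names are hypothetical.

```python
def quantize_int8(vec):
    """Map floats assumed to lie in [-1, 1] to int8 in [-128, 127]
    via a simple linear scale (illustrative, not Voyage's scheme)."""
    return [max(-128, min(127, round(x * 127))) for x in vec]

def quantize_binary(vec):
    """Keep only the sign of each component (1 if positive, else 0)
    and pack 8 signs per byte, MSB first."""
    bits = [1 if x > 0 else 0 for x in vec]
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        packed.append(byte)
    return bytes(packed)
```

With 2048 dimensions, int8 stores each vector in 2048 bytes and binary packing in 256 bytes, which is where the large storage reductions come from.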
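The cost-reduction figures in the last bullet follow from per-vector storage arithmetic. A back-of-envelope check, assuming the OpenAI-v3-large baseline is stored as 3072-dimensional float32 vectors:

```python
# Bits per stored component for each output dtype.
BITS = {"float": 32, "int8": 8, "uint8": 8, "binary": 1}

def bytes_per_vector(dims, dtype):
    """Storage for one embedding vector, in bytes."""
    return dims * BITS[dtype] // 8

baseline = bytes_per_vector(3072, "float")      # 12288 bytes (assumed baseline)
int8_2048 = bytes_per_vector(2048, "int8")      # 2048 bytes
binary_1024 = bytes_per_vector(1024, "binary")  # 128 bytes

savings_int8 = 1 - int8_2048 / baseline         # ~0.83, i.e. ~83%
savings_binary = 1 - binary_1024 / baseline     # ~0.99, i.e. ~99%
```

Actual database costs also depend on index structures and metadata, so treat these as raw vector-storage ratios rather than a full cost model.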