rerank-2.5 Reranker
MongoDB, Inc.
Reranker model that refines retrieval and search accuracy, with instruction-following and a 32K context length.
[This offering is not optimized for latency. A latency-optimized version is coming soon.]
High-accuracy instruction-following reranker that refines search results and improves retrieval quality across domains. Supports 32K context length. Throughput varies significantly with the workload, depending on factors such as GPU type, model size, sequence length, batch size, and vector dimensionality. We typically see approximately 75k to 150k tokens/sec for this model on A100 GPUs. We recommend that customers benchmark their own throughput and token volume during testing to inform token TCO estimates.
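As a rough illustration of how a measured throughput figure feeds into a token TCO estimate, the sketch below converts a token volume into processing hours and cost. The function name `estimate_rerank_cost`, the one-billion-token workload, and the hourly price of 10.0 USD are all hypothetical placeholders, not figures from this listing; substitute your own benchmarked throughput and your actual instance price.

```python
def estimate_rerank_cost(total_tokens, tokens_per_sec, hourly_price_usd):
    """Return (hours, cost_usd) needed to process total_tokens.

    Assumes a sustained, constant throughput, which real workloads
    only approximate; treat the result as a first-order estimate.
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    hours = total_tokens / tokens_per_sec / 3600
    return hours, hours * hourly_price_usd


# Bracket the estimate with both ends of the quoted A100 throughput range.
# The workload size and hourly price are placeholders for illustration.
for tps in (75_000, 150_000):
    hours, cost = estimate_rerank_cost(
        total_tokens=1_000_000_000,
        tokens_per_sec=tps,
        hourly_price_usd=10.0,
    )
    print(f"{tps} tokens/sec: {hours:.2f} h, ${cost:.2f}")
```

Running the estimate at both ends of the range gives a cost band rather than a single number, which is usually the more honest input to a budget.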
Rerank 2.5:
Outperforms Cohere Rerank v3.5 by 7.94% on 93 benchmark datasets across multiple domains
Introduces instruction-following capability to steer reranking using natural language
Improves accuracy by 12.70% on the Massive Instructed Retrieval Benchmark (MAIR)
Delivers double the context length of rerank-2 (32K vs. 16K) at the same cost
Optimized to enhance results from first-stage retrieval methods like BM25, OpenAI v3-large, voyage-3, and voyage-3.5
Provides a seamless upgrade path from rerank-2 with better quality, broader domain coverage, and no pricing changes
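To make the instruction-following and second-stage reranking points concrete, here is a minimal sketch of how a rerank request might be assembled from first-stage candidates. The helper `build_rerank_request` is hypothetical, and the payload field names (`model`, `query`, `documents`, `top_k`) and the choice to prepend the instruction to the query are assumptions; the request format of your deployed endpoint may differ, so check its documentation before relying on this shape.

```python
import json


def build_rerank_request(query, documents, instruction=None, top_k=None):
    """Assemble a rerank payload (hypothetical shape, see assumptions above)."""
    # Assumption: the natural-language instruction is supplied as part of
    # the query text, here prepended on its own line.
    full_query = f"{instruction}\n{query}" if instruction else query
    payload = {
        "model": "rerank-2.5",
        "query": full_query,
        "documents": list(documents),
    }
    if top_k is not None:
        payload["top_k"] = top_k
    return payload


# Candidates as they might come back from a first-stage retriever such as BM25.
candidates = [
    "Apple reported record iPhone revenue this quarter.",
    "Apple trees need full sun and well-drained soil.",
    "How to prune an apple tree in late winter.",
]

request = build_rerank_request(
    query="apple care",
    documents=candidates,
    instruction="Prioritize gardening advice over company news.",
    top_k=2,
)
print(json.dumps(request, indent=2))
```

The instruction lets the same ambiguous query ("apple care") be steered toward one interpretation without changing the first-stage retriever.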