Jina Embeddings v3
Jina AI
New State-of-the-Art Multilingual Embeddings With Task LoRA
- jina-embeddings-v3 is a multilingual, multi-task text embedding model designed for a variety of NLP applications.
- Based on the Jina-XLM-RoBERTa architecture, the model uses Rotary Position Embeddings (RoPE) to handle input sequences of up to 8192 tokens.
- Additionally, it features 5 LoRA adapters to generate task-specific embeddings efficiently.
Highlights:
Extended Sequence Length: Supports up to 8192 tokens with RoPE.
Task-Specific Embedding: Customize embeddings through the task argument with the following options:
- retrieval.query: Used for query embeddings in asymmetric retrieval tasks
- retrieval.passage: Used for passage embeddings in asymmetric retrieval tasks
- separation: Used for embeddings in clustering and re-ranking applications
- classification: Used for embeddings in classification tasks
- text-matching: Used for embeddings in tasks that quantify similarity between two texts, such as STS or symmetric retrieval tasks
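As a rough illustration of how the task argument might be passed, the sketch below builds a JSON request body for an embedding call. The helper `build_embedding_request` is hypothetical, and the field names `model`, `input`, and `task` are assumptions about the request shape; check the documentation of your deployed endpoint before relying on them. The task list includes `retrieval.query`, the query-side counterpart of `retrieval.passage`. Nothing is sent over the network here.

```python
import json
from typing import Dict, List

# Task names of the model's five LoRA adapters.
TASKS = (
    "retrieval.query",    # query side of asymmetric retrieval
    "retrieval.passage",  # passage side of asymmetric retrieval
    "separation",         # clustering and re-ranking
    "classification",     # classification
    "text-matching",      # STS and symmetric retrieval
)


def build_embedding_request(texts: List[str], task: str) -> Dict[str, object]:
    """Build a request body for an embedding call (hypothetical helper).

    The keys "model", "input", and "task" are assumed field names,
    not a confirmed API contract.
    """
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    return {
        "model": "jina-embeddings-v3",
        "input": list(texts),
        "task": task,
    }


# Passages and queries go through different adapters in asymmetric retrieval.
passage_request = build_embedding_request(
    ["Berlin is the capital of Germany."], "retrieval.passage"
)
query_request = build_embedding_request(
    ["What is the capital of Germany?"], "retrieval.query"
)
print(json.dumps(passage_request, indent=2))
```

Validating the task name on the client side catches typos such as `text_matching` before a request is made.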
Matryoshka Embeddings: Supports flexible embedding sizes (32, 64, 128, 256, 512, 768, 1024), so you can truncate embeddings to the size that fits your application.
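Truncating a Matryoshka embedding amounts to keeping its leading components. The following minimal sketch does this on a plain Python list. The helper `truncate_embedding` is hypothetical, and the L2 re-normalization step is an assumption that is useful when the vectors are later compared by cosine or dot-product similarity; the input vector is synthetic, not real model output.

```python
import math
from typing import List, Sequence

# Embedding sizes listed for the model.
SUPPORTED_DIMS = (32, 64, 128, 256, 512, 768, 1024)


def truncate_embedding(vector: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and re-normalize to unit length.

    Hypothetical helper; re-normalizing after truncation is an assumption,
    not a documented requirement.
    """
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"dim must be one of {SUPPORTED_DIMS}, got {dim}")
    if len(vector) < dim:
        raise ValueError(f"vector has {len(vector)} components, need at least {dim}")
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


# Synthetic 1024-dimensional vector standing in for a full-size embedding.
full = [math.sin(i + 1) for i in range(1024)]
small = truncate_embedding(full, 128)
print(len(small))
```

Smaller sizes reduce storage and speed up similarity search, usually at some cost in accuracy, so the right size depends on the application.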