Jina Embeddings v2 Base - de

Jina AI

Text embedding model (base) for English and German input of up to 8192 tokens.

  • jina-embeddings-v2-base-de is an open-source bilingual German-English embedding model that supports sequences of up to 8192 tokens.
  • This state-of-the-art AI embedding model enables many applications, such as document clustering, classification, content personalization, vector search, and retrieval-augmented generation.
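
To make the vector search use case concrete, here is a minimal sketch of ranking documents by cosine similarity to a query. It does not call the model: the 4-dimensional toy vectors stand in for the 768-dimensional embeddings the model would produce, and the names `cosine_similarity`, `rank_documents`, `toy_docs`, and `toy_query` are hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Assumption: toy 4-dimensional vectors in place of real 768-dimensional embeddings.
toy_docs = {
    "doc_de": [0.9, 0.1, 0.0, 0.1],
    "doc_en": [0.8, 0.2, 0.1, 0.0],
    "doc_other": [0.0, 0.1, 0.9, 0.3],
}
toy_query = [1.0, 0.0, 0.0, 0.0]

ranking = rank_documents(toy_query, toy_docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In a real application, the query and document vectors would come from the embedding model, and a vector database would typically replace the linear scan once the collection grows.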

Highlights:
  • State-of-the-art: This model is designed for high performance in monolingual and cross-lingual applications and has been trained specifically to support mixed German-English input without bias.
  • Extended Context: An 8192-token context length enables jina-embeddings-v2-base-de to support longer texts and document fragments, far surpassing models that only support a few hundred tokens at a time.
  • Compact Size: jina-embeddings-v2-base-de is built for high performance on standard computer hardware. With 161 million parameters, the entire model is only 322 MB. Each embedding has 768 dimensions, a relatively small vector size compared to many models, which saves storage space and run time for applications.
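
The size figures above can be checked with a little arithmetic. The sketch below rests on two assumptions that the listing does not state: that the 322 MB figure corresponds to 2 bytes per parameter (16-bit weights, decimal megabytes), and that embeddings are stored as 4-byte float32 values. The function names `model_size_mb` and `index_size_bytes` are hypothetical.

```python
PARAMS = 161_000_000       # parameter count from the listing
EMBEDDING_DIM = 768        # embedding dimensions from the listing
BYTES_PER_PARAM = 2        # assumption: 16-bit weights
BYTES_PER_FLOAT32 = 4      # assumption: embeddings stored as float32


def model_size_mb(params=PARAMS, bytes_per_param=BYTES_PER_PARAM):
    """Approximate weight size in decimal megabytes (1 MB = 10**6 bytes)."""
    return params * bytes_per_param / 1_000_000


def index_size_bytes(num_vectors, dim=EMBEDDING_DIM,
                     bytes_per_value=BYTES_PER_FLOAT32):
    """Raw storage for num_vectors embeddings, ignoring index overhead."""
    return num_vectors * dim * bytes_per_value


print(model_size_mb())              # 322.0
print(index_size_bytes(1))          # 3072
print(index_size_bytes(1_000_000))  # 3072000000
```

Under these assumptions, one embedding takes about 3 KB, so a million documents need roughly 3 GB of raw vector storage before any index overhead or compression.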