jupyterjazz committed on
Commit 041836f
1 Parent(s): 66e357b
Files changed (1)
  1. README.md +49 -5
README.md CHANGED
@@ -120,7 +120,7 @@ library_name: transformers
 
  ## Quick Start
 
- The easiest way to starting using `jina-embeddings-v3` is to use Jina AI's [Embedding API](https://jina.ai/embeddings/).
 
 
  ## Intended Usage & Model Info
@@ -201,7 +201,7 @@ embeddings = F.normalize(embeddings, p=2, dim=1)
  </p>
  </details>
 
- The easiest way to start using `jina-embeddings-v3` is Jina AI's [Embeddings API](https://jina.ai/embeddings/).
 
  Alternatively, you can use `jina-embeddings-v3` directly via Transformers package:
  ```python
@@ -254,17 +254,61 @@ The latest version (#todo: specify version) of SentenceTransformers also support
  from sentence_transformers import SentenceTransformer
 
  model = SentenceTransformer(
-     "jinaai/jina-embeddings-v3", trust_remote_code=True
  )
 
- embeddings = model.encode(['How is the weather today?'], task_type='retrieval.query')
  ```
 
 
 
  ## Performance
 
- TODO UPDATE THIS
 
  ## Contact
 
 
 
  ## Quick Start
 
+ The easiest way to start using `jina-embeddings-v3` is Jina AI's [Embedding API](https://jina.ai/embeddings/).
 
 
  ## Intended Usage & Model Info
 
  </p>
  </details>
 
+ The easiest way to start using `jina-embeddings-v3` is Jina AI's [Embedding API](https://jina.ai/embeddings/).
 
  Alternatively, you can use `jina-embeddings-v3` directly via Transformers package:
  ```python
 
  from sentence_transformers import SentenceTransformer
 
  model = SentenceTransformer(
+     "jinaai/jina-embeddings-v3",
+     prompts={
+         "retrieval.query": "Represent the query for retrieving evidence documents: ",
+         "retrieval.passage": "Represent the document for retrieval: ",
+     },
+     trust_remote_code=True
  )
 
+ embeddings = model.encode(['What is the weather like in Berlin today?'], task_type='retrieval.query')
  ```
 
 
 
  ## Performance
 
+ ### English MTEB
+ | Model | Average | Classification | Clustering | Pair Classification | Reranking | Retrieval | STS | Summarization |
+ |:------------------------------:|:-------:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|
+ | jina-embeddings-v2-en | 58.12 | 68.82| 40.08| 84.44| 55.09| 45.64| 80.00| 30.56|
+ | jina-embeddings-v3 | **65.60** | **82.58**| 45.27| 84.01| 58.13| 53.87| **85.8** | 30.98|
+ | text-embedding-3-large | 62.03 | 75.45| 49.01| 84.22| 59.16| 55.44| 81.04| 29.92|
+ | multilingual-e5-large-instruct | 64.41 | 77.56| 47.1 | 86.19| 58.58| 52.47| 84.78| 30.39|
+ | Cohere-embed-multilingual-v3.0 | 60.08 | 64.01| 46.6 | 86.15| 57.86| 53.84| 83.15| 30.99|
+
+ ### Multilingual MTEB
+
+ | Model | Average | Classification | Clustering | Pair Classification | Reranking | Retrieval | STS | Summarization |
+ |:------------------------------:|:-------:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|
+ | jina-embeddings-v2 | 60.54 | 65.69| 39.36| **82.95**| 66.57| 58.24| 66.6 | - |
+ | jina-embeddings-v3 | **64.44** | **71.46**| 46.71| 76.91| 63.98| 57.98| **69.83**| - |
+ | multilingual-e5-large | 59.58 | 65.22| 42.12| 76.95| 63.4 | 52.37| 64.65| - |
+ | multilingual-e5-large-instruct | 64.25 | 67.45| **52.12**| 77.79| **69.02**| **58.38**| 68.77| - |
+
+
+ ### Long Context Tasks (LongEmbed)
+
+ | Model | Average | NarrativeQA | Needle | Passkey | QMSum | SummScreen | WikiQA |
+ |:--------------------:|:-------:|:-----------:|:------:|:-------:|:-----:|:----------:|:------:|
+ | jina-embeddings-v3* | **70.39** | 33.32 | **84.00** | **100.00** | **39.75** | 92.78 | 72.46 |
+ | jina-embeddings-v2 | 58.12 | 37.89 | 54.25 | 50.25 | 38.87 | 93.48 | 73.99 |
+ | text-embedding-3-large | 51.3 | 44.09 | 29.25 | 63.00 | 32.49 | 84.80 | 54.16 |
+ | baai-bge-m3 | 56.56 | **45.76** | 40.25 | 46.00 | 35.54 | **94.09** | **77.73** |
+
+ **Notes:**
+ - `*`: text-matching adapter
+
+
+ #### Matryoshka Embeddings
+
+ | Task | 32 | 64 | 128 | 256 | 512 | 768 | 1024 |
+ |:-------------:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|
+ | Retrieval | 52.54| 58.54| 61.64| 62.72| 63.16| 63.30| 63.35|
+ | STS | 76.35| 77.03| 77.43| 77.56| 77.59| 77.59| 77.58|
+
+ For a comprehensive evaluation and detailed metrics, please refer to the full paper available here (coming soon).
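
Reviewer note on the Matryoshka table: the diff does not say how the shorter embeddings are produced. The sketch below assumes the column headers (32 through 1024) are output dimensions, and that a shorter embedding is obtained by keeping the leading components of the full vector and re-normalizing it, which is the usual Matryoshka practice rather than something this commit confirms. The helper name `truncate_embedding` is hypothetical and is not part of the model's API.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-normalize the result.

    Hypothetical helper for illustration only; assumes the leading
    components carry the most information, as in Matryoshka training.
    """
    if not 0 < dim <= len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}, got {dim}")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero vector cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]


# Toy 3-dimensional vector standing in for a real 1024-dimensional embedding.
full = [3.0, 4.0, 12.0]
short = truncate_embedding(full, 2)
print(short)  # [0.6, 0.8]
```

With real model output, the same function would be applied to each row before computing cosine similarities, so that scores at a reduced dimension are comparable to those in the table only under the assumptions stated above.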
 
  ## Contact