Update README.md

Include binary MRL in code examples and link to the blog post

README.md
## Quickstart

Here, we provide several ways to produce sentence embeddings. Please note that you have to provide the prompt `Represent this sentence for searching relevant passages:` for the query if you want to use the model for retrieval. Apart from that, no prompt is needed. Our model also supports Matryoshka Representation Learning and (binary) quantization when used via the API or Sentence Transformers.

### sentence-transformers
Install the library:

```bash
python -m pip install -U sentence-transformers
```

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim
from sentence_transformers.quantization import quantize_embeddings

# 1. load model
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

# 2. encode (the `docs` list is defined earlier in the full README; elided here)
embeddings = model.encode(docs)

similarities = cos_sim(embeddings[0], embeddings[1:])
print('similarities:', similarities)

# 2a. Encode with selection of MRL dimensions
mrl_embeddings = model.encode(docs, normalize_embeddings=True)[..., :512]

mrl_similarities = cos_sim(mrl_embeddings[0], mrl_embeddings[1:])
print('mrl_similarities:', mrl_similarities)

# 3. Apply binary quantization
binary_embeddings = quantize_embeddings(embeddings, precision="binary")
binary_mrl_embeddings = quantize_embeddings(mrl_embeddings, precision="binary")
```
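For intuition, binary quantization as applied above is built on a simple idea: threshold each embedding dimension at zero and pack eight dimensions into one byte. The following is a pure-Python sketch of that idea, not the library's exact encoding (which also handles signed representations and batching):

```python
def binarize(vec):
    """Threshold each dimension at 0: positive -> 1, else 0."""
    return [1 if x > 0 else 0 for x in vec]

def pack_bits(bits):
    """Pack 8 bits per byte, mirroring the idea behind np.packbits."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

embedding = [0.12, -0.5, 0.3, 0.0, 0.9, -0.1, 0.2, 0.4]  # toy 8-dim vector
bits = binarize(embedding)   # [1, 0, 1, 0, 1, 0, 1, 1]
packed = pack_bits(bits)     # 1 byte instead of 32 (8 x float32)
print(bits, packed.hex())
```

A 1024-dimensional float32 embedding shrinks from 4096 bytes to 128 bytes under this scheme.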
### Transformers

```python
def transform_query(query: str) -> str:
    """
    ...
    """
    return f'Represent this sentence for searching relevant passages: {query}'

# The model works really well with cls pooling (default) but also with mean pooling.
def pooling(outputs: torch.Tensor, inputs: Dict, strategy: str = 'cls') -> np.ndarray:
    if strategy == 'cls':
        outputs = outputs[:, 0]
```
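The cls-vs-mean distinction in the `pooling` helper above can be illustrated without torch: cls pooling takes the first token's vector, while mean pooling averages across all tokens (the real implementation additionally weights by the attention mask). A toy pure-Python version:

```python
def pooling(token_embs, strategy='cls'):
    """token_embs: list of per-token vectors for a single sequence."""
    if strategy == 'cls':
        return token_embs[0]  # first ([CLS]) token's vector
    if strategy == 'mean':
        n = len(token_embs)
        return [sum(dim) / n for dim in zip(*token_embs)]
    raise NotImplementedError

tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(pooling(tokens, 'cls'))   # [1.0, 2.0]
print(pooling(tokens, 'mean'))  # [3.0, 4.0]
```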
You can use the model via our API as follows:

```python
from mixedbread_ai.client import MixedbreadAI
import os

mxbai = MixedbreadAI(api_key="{MIXEDBREAD_API_KEY}")

english_sentences = [
    ...  # list contents elided in this diff
]

res = mxbai.embeddings(
    input=english_sentences,
    model="mixedbread-ai/mxbai-embed-large-v1",
    normalized=True,
    encoding_format=['ubinary', 'float'],
    dimensions=512,
)

print(res.dimensions, res.data[0].embedding.ubinary, res.data[0].embedding.float_)
```

The API comes with native INT8 and binary quantization support! Check out the [docs](https://mixedbread.ai/docs) for more information.
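Packed binary embeddings like the `ubinary` output above are usually compared with Hamming distance (the number of differing bits) rather than cosine similarity; rescoring the top candidates with the float embeddings then recovers most of the accuracy. A minimal sketch using hypothetical 16-bit vectors:

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two packed binary embeddings."""
    return sum(bin(x ^ y).count('1') for x, y in zip(a, b))

query = bytes([0b10101011, 0b11110000])
near = bytes([0b10101010, 0b11110000])   # differs from query in 1 bit
far = bytes([0b01010100, 0b00001111])    # differs in all 16 bits
print(hamming(query, near), hamming(query, far))  # 1 16
```

Because it reduces to XOR and popcount, this comparison is dramatically faster than float dot products at search time.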
### Why binary MRL?

The combination of binary quantization and Matryoshka Representation Learning allows you to reduce the memory usage of your embeddings significantly. This leads to much lower costs when using a vector database. You can read more about the technology and its advantages in our [blog post](https://www.mixedbread.ai/blog/binary-mrl).
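To make the savings concrete, here is the back-of-the-envelope arithmetic for one million embeddings, assuming full vectors are stored as 1024-dimensional float32 and binary MRL vectors as 512 dimensions at 1 bit each (the numbers, not any particular API, are the point):

```python
full_dim, mrl_dim = 1024, 512
n_vectors = 1_000_000

float32_bytes = full_dim * 4      # 4096 bytes per vector
binary_mrl_bytes = mrl_dim // 8   # 64 bytes per vector

print(f"float32 @ {full_dim}d: {n_vectors * float32_bytes / 2**30:.2f} GiB")
print(f"binary  @ {mrl_dim}d: {n_vectors * binary_mrl_bytes / 2**30:.3f} GiB")
print(f"compression factor: {float32_bytes // binary_mrl_bytes}x")  # 64x
```

A 64x reduction in storage is what makes large-scale vector search affordable on commodity hardware.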
## Evaluation

As of March 2024, our model achieves SOTA performance for BERT-large sized models on the [MTEB](https://huggingface.co/spaces/mteb/leaderboard). It outperforms commercial models like OpenAI's text-embedding-3-large and matches the performance of models 20x its size, such as [echo-mistral-7b](https://huggingface.co/jspringer/echo-mistral-7b-instruct-lasttoken). Our model was trained with no overlap with the MTEB data, which indicates that it generalizes well across several domains, tasks, and text lengths. We know there are some limitations with this model, which will be fixed in v2.