Commit 82f6eee by Aamir (parent fef70ba): Update README.md
# 🪆mxbai-embed-2d-large-v1🪆

This is our [2DMSE](https://arxiv.org/abs/2402.14776) sentence embedding model. It supports adaptive transformer layers and embedding sizes. Find out more in our [blog post](https://mixedbread.ai/blog/2d-mse).

TLDR: 2D-🪆 allows you to shrink both the model and the embeddings. Shrinking only the embedding size yields results competitive with other models such as the [nomic embedding model](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5). Shrinking the model to ~50% maintains up to 85% of the performance without further training.
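The embedding-size half of the TLDR can be made concrete: shrinking the embedding size simply means comparing a prefix of each vector. A minimal numpy sketch with toy vectors (not real model outputs):

```python
import numpy as np

# Two toy 4-dim "embeddings" (rows); real ones would come from the model.
embs = np.array([[1.0, 0.0, 1.0, 0.0],
                 [1.0, 0.0, -1.0, 0.0]])

def cos(a, b):
    """Plain cosine similarity of two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cos(embs[0], embs[1]))          # full 4 dims: 0.0
print(cos(embs[0, :2], embs[1, :2]))  # truncated to the first 2 dims: 1.0
```

With a 2DMSE-trained model the leading dimensions carry most of the signal, so the truncated similarity stays close to the full one.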
## Quickstart

Here, we provide several ways to produce sentence embeddings with adaptive layers and embedding sizes. **For this version, it is recommended to set adaptive layers from 20 to 24.**

### sentence-transformers

Currently, the best way to use our models is with the most recent version of sentence-transformers.

```python
# (earlier lines elided in this excerpt: imports and the word_embedding_model /
#  pooling_model modules that make up the SentenceTransformer)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# 2. set adaptive layer and embedding size.
# it is recommended to set layers from 20 to 24.
new_num_layers = 22  # 1D: set layer size
model[0].auto_model.encoder.layer = model[0].auto_model.encoder.layer[:new_num_layers]

new_embedding_size = 768  # 2D: set embedding size

# 3. encode
embeddings = model.encode(...)  # input sentences elided in this excerpt

similarities = cos_sim(embeddings[0, :new_embedding_size], embeddings[1, :new_embedding_size])
print('similarities:', similarities)
```
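Why does slicing `encoder.layer` work? A BERT-style encoder applies its layers strictly in sequence, so keeping only the first k entries of the layer list just stops the forward pass early. A toy illustration with plain functions standing in for transformer layers:

```python
# Each "layer" is a function; the encoder applies them in order.
layers = [lambda x, i=i: x + [i] for i in range(24)]

def forward(layers, x):
    # Sequential application, like a transformer encoder's layer stack.
    for layer in layers:
        x = layer(x)
    return x

full = forward(layers, [])            # runs all 24 layers
truncated = forward(layers[:22], [])  # stops after 22, like encoder.layer[:22]
assert truncated == full[:22]
```

This is why no surgery beyond the list slice is needed: the remaining layers are simply never called.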

### angle-emb

You can also use the latest `angle-emb` for inference, as follows:
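The angle-emb snippet itself is not shown in this excerpt. A hedged sketch of how truncated similarities can be computed on its output; the helper `truncated_cosine` and the `pooling_strategy` choice are our assumptions, not the card's code:

```python
import numpy as np

def truncated_cosine(a, b, size):
    """Cosine similarity on the first `size` dimensions of two embeddings."""
    a = np.asarray(a, dtype=float)[:size]
    b = np.asarray(b, dtype=float)[:size]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical usage (downloads the model; pooling strategy is an assumption):
# from angle_emb import AnglE
# angle = AnglE.from_pretrained('mixedbread-ai/mxbai-embed-2d-large-v1',
#                               pooling_strategy='cls')
# embs = angle.encode(['sentence a', 'sentence b'])
# print('similarities:', truncated_cosine(embs[0], embs[1], 768))
```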

You’ll be able to use the models through our API as well. The API is coming soon and will have some exciting features. Stay tuned!

## Evaluation

Please find more information in our [blog post](https://mixedbread.ai/blog/2d-mse).

## Community

Please join our [Discord Community](https://discord.gg/jDfMHzAVfU) and share your feedback and thoughts! We are here to help and also always happy to chat.

## License

Apache 2.0