Sentence Similarity
sentence-transformers
PyTorch
TensorBoard
Transformers
English
German
t5
text-embedding
embeddings
information-retrieval
beir
text-classification
language-model
text-clustering
text-semantic-similarity
text-evaluation
prompt-retrieval
text-reranking
feature-extraction
natural_questions
ms_marco
fever
hotpot_qa
mteb
pascalhuerten committed
Commit d71e600
1 Parent(s): eb63880
Add short description and example for skill retrieval task
README.md
CHANGED
@@ -2523,8 +2523,26 @@ model-index:
    - type: max_f1
      value: 78.39889075384951
---
# pascalhuerten/instructor-skillfit
A fine-tuned version of hkunlp/instructor-base, specialized in retrieving relevant skills for a given learning outcome.
## Skill Retrieval
You can use the **customized embeddings** to retrieve the skills that best match a given learning outcome:
```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from InstructorEmbedding import INSTRUCTOR

# Load the fine-tuned skill-retrieval model (pip install InstructorEmbedding)
model = INSTRUCTOR('pascalhuerten/instructor-skillfit')

# Each input is an [instruction, text] pair; the query is a German learning outcome
# ("install WordPress, plan a website or blog, add plugins and widgets").
query = [['Represent the learning outcome for retrieval: ',
          'WordPress installieren\nWebsite- oder Blogplanung\nPlugins und Widgets einfügen']]
corpus = [['Represent the skill for retrieval: ', 'WordPress'],
          ['Represent the skill for retrieval: ', 'Website-Wireframe erstellen'],
          ['Represent the skill for retrieval: ', 'Software für Content-Management-Systeme nutzen']]

query_embeddings = model.encode(query)
corpus_embeddings = model.encode(corpus)

# Rank the candidate skills by cosine similarity to the learning outcome
similarities = cosine_similarity(query_embeddings, corpus_embeddings)
retrieved_doc_id = np.argmax(similarities)
print(retrieved_doc_id)
```
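The printed value is the index of the best-matching entry in `corpus`; for a full ranking of the candidate skills instead of a single match, sort the similarity scores, e.g. with `np.argsort(similarities[0])[::-1]`.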
## hkunlp/instructor-base
We introduce **Instructor**👨‍🏫, an instruction-finetuned text embedding model that can generate text embeddings tailored to any task (e.g., classification, retrieval, clustering, text evaluation) and domain (e.g., science, finance) ***by simply providing the task instruction, without any finetuning***. Instructor👨 achieves state-of-the-art results on 70 diverse embedding tasks!
The model is easy to use with **our customized** `sentence-transformer` library. For more details, check out [our paper](https://arxiv.org/abs/2212.09741) and [project page](https://instructor-embedding.github.io/)!
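For reference, here is a minimal sketch of that usage pattern, assuming the `InstructorEmbedding` package is installed (`pip install InstructorEmbedding`): each input is an `[instruction, text]` pair, and the instruction conditions the resulting embedding.

```python
from InstructorEmbedding import INSTRUCTOR

# Load the base model (or a fine-tuned checkpoint such as pascalhuerten/instructor-skillfit)
model = INSTRUCTOR('hkunlp/instructor-base')

# The instruction tells the model how the embedding will be used downstream
instruction = "Represent the skill for retrieval: "
text = "Content management systems"
embedding = model.encode([[instruction, text]])
print(embedding.shape)  # e.g. (1, 768) for the instructor-base architecture
```

Changing only the instruction string adapts the same model to other use cases such as classification, clustering, or semantic similarity.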