pascalhuerten committed on
Commit
d71e600
1 Parent(s): eb63880

Add short description and example for skill retrieval task

  1. README.md +19 -1
README.md CHANGED
@@ -2523,8 +2523,26 @@ model-index:
  - type: max_f1
  value: 78.39889075384951
  ---

- # hkunlp/instructor-base
  We introduce **Instructor**👨‍🏫, an instruction-finetuned text embedding model that can generate text embeddings tailored to any task (e.g., classification, retrieval, clustering, text evaluation, etc.) and domains (e.g., science, finance, etc.) ***by simply providing the task instruction, without any finetuning***. Instructor👨‍ achieves sota on 70 diverse embedding tasks!
  The model is easy to use with **our customized** `sentence-transformer` library. For more details, check out [our paper](https://arxiv.org/abs/2212.09741) and [project page](https://instructor-embedding.github.io/)!

  - type: max_f1
  value: 78.39889075384951
  ---
+ # pascalhuerten/instructor-skillfit
+ A fine-tuned version of hkunlp/instructor-base, specialized in retrieving relevant skills for a given learning outcome.

+ ## Skill Retrieval
+ Use the model's **customized embeddings** to retrieve the skills most relevant to a learning outcome.
+ ```python
+ import numpy as np
+ from sklearn.metrics.pairwise import cosine_similarity
+ # Assumes the InstructorEmbedding package used by hkunlp/instructor-base.
+ from InstructorEmbedding import INSTRUCTOR
+
+ # Load the fine-tuned model (repository name taken from the heading above).
+ model = INSTRUCTOR('pascalhuerten/instructor-skillfit')
+
+ # Each input is an [instruction, text] pair.
+ query = [['Represent the learning outcome for retrieval: : ','WordPress installieren\nWebsite- oder Blogplanung\nPlugins und Widges einfügen']]
+ corpus = [['Represent the skill for retrieval: ','WordPress'],
+           ['Represent the skill for retrieval: ','Website-Wireframe erstellen'],
+           ['Represent the skill for retrieval: ','Software für Content-Management-Systeme nutzen']]
+ query_embeddings = model.encode(query)
+ corpus_embeddings = model.encode(corpus)
+ similarities = cosine_similarity(query_embeddings, corpus_embeddings)
+ # Index of the corpus entry most similar to the query.
+ retrieved_doc_id = np.argmax(similarities)
+ print(retrieved_doc_id)
+ ```
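The snippet in this hunk returns only the single best match via `np.argmax`. When several candidate skills should be ranked, the similarity scores can be sorted instead. The following is a minimal sketch, not part of the commit: `top_k_skills` is a hypothetical helper, and the small toy vectors are made-up stand-ins for the output of `model.encode(...)` so that the example runs without downloading the model.

```python
import numpy as np

def top_k_skills(query_embedding, corpus_embeddings, skills, k=2):
    """Return the k skills most similar to the query, best first."""
    q = np.asarray(query_embedding, dtype=float)
    c = np.asarray(corpus_embeddings, dtype=float)
    # Cosine similarity is the dot product of L2-normalised vectors.
    q = q / np.linalg.norm(q)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    scores = c @ q
    # Sort by descending score and keep the first k indices.
    order = np.argsort(-scores)[:k]
    return [(skills[i], float(scores[i])) for i in order]

# Toy 3-dimensional vectors (assumed values, not real model output).
skills = ['WordPress',
          'Website-Wireframe erstellen',
          'Software für Content-Management-Systeme nutzen']
corpus_embeddings = np.array([[0.9, 0.1, 0.0],
                              [0.1, 0.9, 0.2],
                              [0.7, 0.2, 0.3]])
query_embedding = np.array([1.0, 0.2, 0.1])

for skill, score in top_k_skills(query_embedding, corpus_embeddings, skills):
    print(f"{score:.3f}  {skill}")
```

With real embeddings, `query_embedding` would be one row of `query_embeddings` and `skills` would hold the text part of each corpus pair.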
+
+ ## hkunlp/instructor-base
  We introduce **Instructor**👨‍🏫, an instruction-finetuned text embedding model that can generate text embeddings tailored to any task (e.g., classification, retrieval, clustering, text evaluation, etc.) and domains (e.g., science, finance, etc.) ***by simply providing the task instruction, without any finetuning***. Instructor👨‍ achieves sota on 70 diverse embedding tasks!
  The model is easy to use with **our customized** `sentence-transformer` library. For more details, check out [our paper](https://arxiv.org/abs/2212.09741) and [project page](https://instructor-embedding.github.io/)!