ml6team/keyphrase-extraction-kbir-inspec

@@ -33,9 +33,12 @@ model-index:
       type: midas/inspec
       name: inspec
     metrics:
-      - type: seqeval
         value: 0.588
-        name: F1-score
 ---
 # 🔑 Keyphrase Extraction Model: KBIR-inspec
 Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a document. Thanks to these keyphrases humans can understand the content of a text very quickly and easily without reading it completely. Keyphrase extraction was first done primarily by human annotators, who read the text in detail and then wrote down the most important keyphrases. The disadvantage is that if you work with a lot of documents, this process can take a lot of time ⏳.
@@ -104,22 +107,22 @@ extractor = KeyphraseExtractionPipeline(model=model_name)
 ```python
 # Inference
 text = """
-Keyphrase extraction is a technique in text analysis where you extract the
-important keyphrases from a document. Thanks to these keyphrases humans can
-understand the content of a text very quickly and easily without reading it
-completely. Keyphrase extraction was first done primarily by human annotators,
-who read the text in detail and then wrote down the most important keyphrases.
-The disadvantage is that if you work with a lot of documents, this process
 can take a lot of time.
-Here is where Artificial Intelligence comes in. Currently, classical machine
-learning methods, that use statistical and linguistic features, are widely used
-for the extraction process. Now with deep learning, it is possible to capture
-the semantic meaning of a text even better than these classical methods.
-Classical methods look at the frequency, occurrence and order of words
-in the text, whereas these neural approaches can capture long-term
 semantic dependencies and context of words in a text.
-"""
 keyphrases = extractor(text)
@@ -130,7 +133,8 @@ print(keyphrases)
 ```
 # Output
 ['Artificial Intelligence' 'Keyphrase extraction' 'deep learning'
- 'features' 'text analysis']
 ```
 ## 📚 Training Dataset
@@ -213,8 +217,8 @@ tokenized_dataset = dataset.map(preprocess_fuction, batched=True)
 ```
-### Postprocessing
-For the post-processing, you will need to filter out the B and I labeled tokens and concat the consecutive Bs and Is. As last you strip the keyphrases to ensure all spaces are removed.
 ```python
 # Define post_process functions
 def concat_tokens_by_tag(keyphrases):

       type: midas/inspec
       name: inspec
     metrics:
+      - type: F1 (Seqeval)
         value: 0.588
+        name: F1 (Seqeval)
+      - type: F1@M
+        value: 0.564
+        name: F1@M
 ---
 # 🔑 Keyphrase Extraction Model: KBIR-inspec
 Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a document. Thanks to these keyphrases humans can understand the content of a text very quickly and easily without reading it completely. Keyphrase extraction was first done primarily by human annotators, who read the text in detail and then wrote down the most important keyphrases. The disadvantage is that if you work with a lot of documents, this process can take a lot of time ⏳.
 ```python
 # Inference
 text = """
+Keyphrase extraction is a technique in text analysis where you extract the
+important keyphrases from a document. Thanks to these keyphrases humans can
+understand the content of a text very quickly and easily without reading it
+completely. Keyphrase extraction was first done primarily by human annotators,
+who read the text in detail and then wrote down the most important keyphrases.
+The disadvantage is that if you work with a lot of documents, this process
 can take a lot of time.
+Here is where Artificial Intelligence comes in. Currently, classical machine
+learning methods, that use statistical and linguistic features, are widely used
+for the extraction process. Now with deep learning, it is possible to capture
+the semantic meaning of a text even better than these classical methods.
+Classical methods look at the frequency, occurrence and order of words
+in the text, whereas these neural approaches can capture long-term
 semantic dependencies and context of words in a text.
+""".replace("\n", " ")
 keyphrases = extractor(text)
 ```
 # Output
 ['Artificial Intelligence' 'Keyphrase extraction' 'deep learning'
+ 'linguistic features' 'machine learning' 'semantic meaning'
+ 'text analysis']
 ```
 ## 📚 Training Dataset
 ```
+### Postprocessing (Without Pipeline Function)
+If you do not use the pipeline function, you must keep only the tokens labeled B or I and discard the rest. Each B token, together with the I tokens that directly follow it, is then merged into one keyphrase. Finally, strip each keyphrase to remove any leading or trailing whitespace.
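The merging step described above can be sketched as follows. This is a minimal illustration, not the model card's own code: it assumes the predictions have already been mapped to plain `B`, `I`, and `O` string labels, one per word, and the function name `merge_bio_tokens` and the sample inputs are hypothetical.

```python
def merge_bio_tokens(tokens, labels):
    """Merge B/I-labeled tokens into keyphrases.

    Assumption: `labels` holds one of "B", "I", or "O" per token.
    Tokens labeled "O" are skipped, and an "I" that does not follow
    a "B" or another "I" is dropped.
    """
    keyphrases = []
    current = []  # tokens of the keyphrase currently being built
    for token, label in zip(tokens, labels):
        if label == "B":
            # A new keyphrase starts; close the previous one first.
            if current:
                keyphrases.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            # Continue the keyphrase that is already open.
            current.append(token)
        else:
            # "O" (or a stray "I") ends any open keyphrase.
            if current:
                keyphrases.append(" ".join(current))
            current = []
    if current:
        keyphrases.append(" ".join(current))
    # Strip leading and trailing whitespace from each keyphrase.
    return [keyphrase.strip() for keyphrase in keyphrases]


# Hypothetical example input, not taken from the dataset.
tokens = ["Keyphrase", "extraction", "is", "a", "technique", "in", "text", "analysis"]
labels = ["B", "I", "O", "O", "O", "O", "B", "I"]
example_keyphrases = merge_bio_tokens(tokens, labels)
print(example_keyphrases)
```

Joining with a single space is a simplification for word-level tokens; subword tokens would first need to be converted back to text with the tokenizer.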
 ```python
 # Define post_process functions
 def concat_tokens_by_tag(keyphrases):