DeDeckerThomas committed on
Commit
94ec415
•
1 Parent(s): 0a79f9e

Update README.md

  1. README.md +22 -18
README.md CHANGED
@@ -33,9 +33,12 @@ model-index:
33
  type: midas/inspec
34
  name: inspec
35
  metrics:
36
- - type: seqeval
37
  value: 0.588
38
- name: F1-score
 
 
 
39
  ---
40
  # 🔑 Keyphrase Extraction Model: KBIR-inspec
41
  Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a document. Thanks to these keyphrases humans can understand the content of a text very quickly and easily without reading it completely. Keyphrase extraction was first done primarily by human annotators, who read the text in detail and then wrote down the most important keyphrases. The disadvantage is that if you work with a lot of documents, this process can take a lot of time ⏳.
@@ -104,22 +107,22 @@ extractor = KeyphraseExtractionPipeline(model=model_name)
104
  ```python
105
  # Inference
106
  text = """
107
- Keyphrase extraction is a technique in text analysis where you extract the
108
- important keyphrases from a document. Thanks to these keyphrases humans can
109
- understand the content of a text very quickly and easily without reading it
110
- completely. Keyphrase extraction was first done primarily by human annotators,
111
- who read the text in detail and then wrote down the most important keyphrases.
112
- The disadvantage is that if you work with a lot of documents, this process
113
  can take a lot of time.
114
 
115
- Here is where Artificial Intelligence comes in. Currently, classical machine
116
- learning methods, that use statistical and linguistic features, are widely used
117
- for the extraction process. Now with deep learning, it is possible to capture
118
- the semantic meaning of a text even better than these classical methods.
119
- Classical methods look at the frequency, occurrence and order of words
120
- in the text, whereas these neural approaches can capture long-term
121
  semantic dependencies and context of words in a text.
122
- """
123
 
124
  keyphrases = extractor(text)
125
 
@@ -130,7 +133,8 @@ print(keyphrases)
130
  ```
131
  # Output
132
  ['Artificial Intelligence' 'Keyphrase extraction' 'deep learning'
133
- 'features' 'text analysis']
 
134
  ```
135
 
136
  ## 📚 Training Dataset
@@ -213,8 +217,8 @@ tokenized_dataset = dataset.map(preprocess_fuction, batched=True)
213
 
214
  ```
215
 
216
- ### Postprocessing
217
- For the post-processing, you will need to filter out the B and I labeled tokens and concat the consecutive Bs and Is. As last you strip the keyphrases to ensure all spaces are removed.
218
  ```python
219
  # Define post_process functions
220
  def concat_tokens_by_tag(keyphrases):
 
33
  type: midas/inspec
34
  name: inspec
35
  metrics:
36
+ - type: F1 (Seqeval)
37
  value: 0.588
38
+ name: F1 (Seqeval)
39
+ - type: F1@M
40
+ value: 0.564
41
+ name: F1@M
42
  ---
43
  # 🔑 Keyphrase Extraction Model: KBIR-inspec
44
  Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a document. Thanks to these keyphrases humans can understand the content of a text very quickly and easily without reading it completely. Keyphrase extraction was first done primarily by human annotators, who read the text in detail and then wrote down the most important keyphrases. The disadvantage is that if you work with a lot of documents, this process can take a lot of time ⏳.
 
107
  ```python
108
  # Inference
109
  text = """
110
+ Keyphrase extraction is a technique in text analysis where you extract the
111
+ important keyphrases from a document. Thanks to these keyphrases humans can
112
+ understand the content of a text very quickly and easily without reading it
113
+ completely. Keyphrase extraction was first done primarily by human annotators,
114
+ who read the text in detail and then wrote down the most important keyphrases.
115
+ The disadvantage is that if you work with a lot of documents, this process
116
  can take a lot of time.
117
 
118
+ Here is where Artificial Intelligence comes in. Currently, classical machine
119
+ learning methods, that use statistical and linguistic features, are widely used
120
+ for the extraction process. Now with deep learning, it is possible to capture
121
+ the semantic meaning of a text even better than these classical methods.
122
+ Classical methods look at the frequency, occurrence and order of words
123
+ in the text, whereas these neural approaches can capture long-term
124
  semantic dependencies and context of words in a text.
125
+ """.replace("\n", " ")
126
 
127
  keyphrases = extractor(text)
128
 
 
133
  ```
134
  # Output
135
  ['Artificial Intelligence' 'Keyphrase extraction' 'deep learning'
136
+ 'linguistic features' 'machine learning' 'semantic meaning'
137
+ 'text analysis']
138
  ```
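
Note on the `.replace("\n", " ")` change in the inference hunk: the commit does not state why it was added, but the expected output in the same commit grows from five to seven keyphrases, so the flattening appears to matter for the result. The following is a minimal sketch of what the call does to a triple-quoted string; the shortened `text` is an illustrative stand-in, not the README's full example.

```python
# Illustrative stand-in for the README's multi-line example text.
text = """
Keyphrase extraction is a technique
in text analysis.
"""

# A triple-quoted literal keeps every line break, including the ones
# directly after the opening quotes and before the closing quotes.
print(repr(text))

# Replacing each newline with a space yields a single-line string.
# The leading and trailing spaces remain; call .strip() if they are unwanted.
flattened = text.replace("\n", " ")
print(repr(flattened))
```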
139
 
140
  ## 📚 Training Dataset
 
217
 
218
  ```
219
 
220
+ ### Postprocessing (Without Pipeline Function)
221
+ If you do not use the pipeline function, you must filter out the B and I labeled tokens. Each B and I will then be merged into a keyphrase. Finally, you need to strip the keyphrases to make sure all unnecessary spaces have been removed.
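
The three steps in this paragraph (filter, merge, strip) can be illustrated with a small standalone sketch. It is not the README's own implementation, which continues in the block that follows. The helper name `merge_bi_tokens` is hypothetical, and the sketch assumes word-level tokens with plain string labels `"B"`, `"I"` and `"O"`, whereas real model output consists of subword tokens and label ids.

```python
def merge_bi_tokens(tokens, labels):
    """Merge B/I-labeled tokens into keyphrases (hypothetical helper).

    Assumes one label per token, each being "B", "I" or "O".
    """
    keyphrases = []
    current = []  # tokens of the keyphrase currently being built

    for token, label in zip(tokens, labels):
        if label == "B":
            # A "B" always starts a new keyphrase; flush the previous one.
            if current:
                keyphrases.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            # An "I" extends the keyphrase opened by the preceding "B".
            current.append(token)
        else:
            # "O" (or an "I" with no open keyphrase) ends the current span.
            if current:
                keyphrases.append(" ".join(current))
            current = []

    if current:
        keyphrases.append(" ".join(current))

    # Strip surrounding whitespace from every keyphrase.
    return [keyphrase.strip() for keyphrase in keyphrases]


tokens = ["Keyphrase", "extraction", "uses", "deep", "learning", "."]
labels = ["B", "I", "O", "B", "I", "O"]
print(merge_bi_tokens(tokens, labels))
# ['Keyphrase extraction', 'deep learning']
```

In this sketch an `I` label that does not follow a `B` is dropped rather than treated as the start of a keyphrase; that is a design choice of the example, not something the commit specifies.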
222
  ```python
223
  # Define post_process functions
224
  def concat_tokens_by_tag(keyphrases):