Commit 4f19e0a by DeDeckerThomas • 1 Parent(s): 4cebcbc

Update README.md

Files changed (1)
  1. README.md +33 -6
README.md CHANGED
@@ -54,20 +54,29 @@ def extract_keyphrases(example, predictions, tokenizer, index=0):
     )
     return np.unique([kp.strip() for kp in extracted_kps])
 
+```
+
+```python
 # Load model and tokenizer
 model_name = "DeDeckerThomas/keyphrase-extraction-kbir-inspec"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForTokenClassification.from_pretrained(model_name)
-
+```
+```python
 # Inference
 text = """
-""".replace(
-    "\n", ""
-)
+Keyword extraction is a technique in text analysis where you extract the important keywords
+from a text. Since this is a time-consuming process, Artificial Intelligence is used to automate it.
+Currently, classical machine learning methods, that use statistics and linguistics, are widely used
+for the extraction process. The fact that these methods have been widely used in the community has
+the advantage that there are many easy-to-use libraries. Now with the recent innovations in
+deep learning methods (such as recurrent neural networks and transformers, GANS, …),
+keyword extraction can be improved. These new methods also focus on the semantics
+and context of a document, which is quite an improvement.
+""".replace("\n", "")
 
 encoded_input = tokenizer(
-    text.split(" "),
-    is_split_into_words=True,
+    text,
     truncation=True,
     padding="max_length",
     max_length=max_length,
@@ -87,6 +96,24 @@ print("***** Prediction *****")
 print(extracted_kps)
 ```
 
+```
+***** Input Document *****
+Keyword extraction is a technique in text analysis where you extract the important keywords
+from a text. Since this is a time-consuming process, Artificial Intelligence is used to automate it.
+Currently, classical machine learning methods, that use statistics and linguistics, are widely used
+for the extraction process. The fact that these methods have been widely used in the community has
+the advantage that there are many easy-to-use libraries. Now with the recent innovations in
+deep learning methods (such as recurrent neural networks and transformers, GANS, …),
+keyword extraction can be improved. These new methods also focus on the semantics
+and context of a document, which is quite an improvement.
+
+***** Prediction *****
+['Artificial Intelligence' 'GANS' 'Keyword extraction'
+ 'classical machine learning methods' 'context' 'deep learning methods'
+ 'keyword extraction' 'linguistics' 'recurrent neural networks'
+ 'semantics' 'statistics' 'text analysis' 'transformers']
+```
+
 ## 📚 Trainig Dataset
 ## 👷‍♂️ Training procedure
 
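
Note on the sample prediction: it lists both 'Keyword extraction' and 'keyword extraction', and the capitalized phrases come first. That is consistent with `np.unique` deduplicating case-sensitively and returning sorted values. The sketch below reproduces that behavior with the standard library only, and shows one possible case-insensitive variant. Both helper names (`dedupe_keyphrases`, `dedupe_case_insensitive`) and the sample list are hypothetical and not part of the README. `sorted(set(...))` is used here as a stand-in that should order plain strings the same way `np.unique` does.

```python
# Sketch only: both helpers are hypothetical illustrations, not README code.


def dedupe_keyphrases(phrases):
    """Strip, deduplicate, and sort phrases (case-sensitive).

    Stdlib stand-in for np.unique([kp.strip() for kp in phrases]).
    """
    return sorted({p.strip() for p in phrases})


def dedupe_case_insensitive(phrases):
    """Keep the first-seen spelling of each phrase, comparing case-insensitively."""
    seen = {}
    for p in phrases:
        stripped = p.strip()
        # setdefault keeps the first spelling stored under each case-folded key
        seen.setdefault(stripped.casefold(), stripped)
    return sorted(seen.values(), key=str.casefold)


# Made-up sample input with stray whitespace and a case variant
raw = [" Keyword extraction", "text analysis", "keyword extraction ", "text analysis"]

print(dedupe_keyphrases(raw))
print(dedupe_case_insensitive(raw))
```

Whether to merge case variants is a judgment call: the case-sensitive form matches what the README prints, while the case-insensitive form gives a shorter list at the cost of picking one spelling arbitrarily.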