Ikala-allen committed
Commit 619e946
1 Parent(s): 5199800

Update README.md

Files changed (1)
  1. README.md +33 -27
README.md CHANGED
@@ -20,7 +20,7 @@ This metric is used for evaluating the quality of relation extraction output. By
 
 
 ## Metric Description
- This metric can be used in relation extraction evaluation.
+ This metric can be used in relation extraction evaluation.
 
 ## How to Use
 This metric takes 2 inputs, predictions and references (ground truth). Both are a list of lists of dictionaries holding each entity's name and type:
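For orientation, a minimal sketch of that input structure and a plain `compute` call, assembled from the snippets shown further down in this diff (the load path and the dictionary keys are exactly as in those examples; the single relation used here is illustrative only):

```python
import evaluate

# Load the metric from the Hub, as in the README examples below.
module = evaluate.load("Ikala-allen/relation_extraction")

# Both inputs are a list of documents; each document is a list of relation
# dicts with the keys head, head_type, type, tail and tail_type.
references = [[
    {"head": "tinadaviespigments", "head_type": "brand", "type": "sell",
     "tail": "國際認證之色乳", "tail_type": "product"},
]]
predictions = [[
    {"head": "tinadaviespigments", "head_type": "brand", "type": "sell",
     "tail": "國際認證之色乳", "tail_type": "product"},
]]

scores = module.compute(predictions=predictions, references=references)
print(scores)  # tp/fp/fn counts plus p, r, f1 and the Macro_* aggregates
```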
@@ -79,7 +79,7 @@ Output Example:
 Note: Macro_f1, Macro_p, Macro_r, p, r, and f1 are always between 0 and 100, while tp, fp, and fn depend on how many relations are in the input.
 
 ### Examples
- Example1 : only one prediction and reference, mode = strict, only output ALL relation score
+ Example 1: only one prediction and reference, mode = strict, output only the ALL relation score
 ```python
 metric_path = "Ikala-allen/relation_extraction"
 module = evaluate.load(metric_path)
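# A hedged completion of the truncated Example 1 snippet above (illustration
# only): the mode/only_all parameter names are taken from Example 4 later in
# this diff, and only_all=True returning just the aggregate "ALL" scores is an
# assumption based on the Example 1 description.
references = [[
    {"head": "phipigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
]]
predictions = [[
    {"head": "phipigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
]]
evaluation_scores = module.compute(predictions=predictions, references=references, mode="strict", only_all=True)
print(evaluation_scores)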
@@ -133,7 +133,7 @@ print(evaluation_scores)
 >>> {'tp': 2, 'fp': 0, 'fn': 1, 'p': 100.0, 'r': 66.66666666666667, 'f1': 80.0, 'Macro_f1': 50.0, 'Macro_p': 50.0, 'Macro_r': 50.0}
 ```
 
- Example3 : two or more prediction and reference, mode = boundaries, only output = False, output all relation type
+ Example 3: two or more predictions and references, mode = boundaries, only_all = False, output scores for every relation type
 ```python
 metric_path = "Ikala-allen/relation_extraction"
 module = evaluate.load(metric_path)
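# A hedged completion of the truncated Example 3 snippet above (illustration
# only): mode="boundaries" and only_all=False use the parameter names shown in
# Example 4 below; with only_all=False the result contains one score dict per
# relation type (e.g. "sell") plus the "ALL" aggregate, as in the Example 3
# output shown at the top of the next hunk.
references = [[
    {"head": "SABONTAIWAN", "head_type": "brand", "type": "sell", "tail": "大馬士革玫瑰有機光燦系列", "tail_type": "product"},
]]
predictions = [[
    {"head": "SABONTAIWAN", "head_type": "brand", "type": "sell", "tail": "大馬士革玫瑰有機光燦系列", "tail_type": "product"},
    {"head": "SNTAIWAN", "head_type": "brand", "type": "sell", "tail": "大馬士革玫瑰有機光燦系列", "tail_type": "product"},
]]
evaluation_scores = module.compute(predictions=predictions, references=references, mode="boundaries", only_all=False)
print(evaluation_scores)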
@@ -168,34 +168,40 @@ print(evaluation_scores)
 >>> {'sell': {'tp': 3, 'fp': 1, 'fn': 0, 'p': 75.0, 'r': 100.0, 'f1': 85.71428571428571}, 'belongs_to': {'tp': 0, 'fp': 0, 'fn': 1, 'p': 0, 'r': 0, 'f1': 0}, 'ALL': {'tp': 3, 'fp': 1, 'fn': 1, 'p': 75.0, 'r': 75.0, 'f1': 75.0, 'Macro_f1': 42.857142857142854, 'Macro_p': 37.5, 'Macro_r': 50.0}}
 ```
 
- Example 4 with two or more prediction and reference:
+ Example 4: two or more predictions and references, mode = boundaries, only_all = False, relation_types = ["belongs_to"], so only the belongs_to relation type is scored
 ```python
- >>> metric_path = "Ikala-allen/relation_extraction"
- >>> module = evaluate.load(metric_path)
- >>> references = [
- ... [
- ... {"head": "phip igments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
- ... {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
- ... ],[
- ... {'head': 'SABONTAIWAN', 'tail': '大馬士革玫瑰有機光燦系列', 'head_type': 'brand', 'tail_type': 'product', 'type': 'sell'}
- ... ]
- ... ]
- >>> predictions = [
- ... [
- ... {"head": "phipigments", "head_type": "product", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
- ... {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
- ... ],[
- ... {'head': 'SABONTAIWAN', 'tail': '大馬士革玫瑰有機光燦系列', 'head_type': 'brand', 'tail_type': 'product', 'type': 'sell'},
- ... {'head': 'SNTAIWAN', 'tail': '大馬士革玫瑰有機光燦系列', 'head_type': 'brand', 'tail_type': 'product', 'type': 'sell'}
- ... ]
- ... ]
- >>> evaluation_scores = module.compute(predictions=predictions, references=references)
- >>> print(evaluation_scores)
- {'sell': {'tp': 2, 'fp': 2, 'fn': 1, 'p': 50.0, 'r': 66.66666666666667, 'f1': 57.142857142857146}, 'ALL': {'tp': 2, 'fp': 2, 'fn': 1, 'p': 50.0, 'r': 66.66666666666667, 'f1': 57.142857142857146, 'Macro_f1': 57.142857142857146, 'Macro_p': 50.0, 'Macro_r': 66.66666666666667}}
+ metric_path = "Ikala-allen/relation_extraction"
+ module = evaluate.load(metric_path)
+
+ # Example references (ground truth)
+ references = [
+     [
+         {"head": "phipigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
+         {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
+     ],
+     [
+         {'head': 'SABONTAIWAN', 'tail': '大馬士革玫瑰有機光燦系列', 'head_type': 'brand', 'tail_type': 'product', 'type': 'sell'},
+         {'head': 'A醛賦活緊緻精華', 'tail': 'Serum', 'head_type': 'product', 'tail_type': 'category', 'type': 'belongs_to'},
+     ]
+ ]
+
+ # Example predictions
+ predictions = [
+     [
+         {"head": "phipigments", "head_type": "product", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
+         {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
+     ],
+     [
+         {'head': 'SABONTAIWAN', 'tail': '大馬士革玫瑰有機光燦系列', 'head_type': 'brand', 'tail_type': 'product', 'type': 'sell'},
+         {'head': 'SNTAIWAN', 'tail': '大馬士革玫瑰有機光燦系列', 'head_type': 'brand', 'tail_type': 'product', 'type': 'sell'}
+     ]
+ ]
+
+ evaluation_scores = module.compute(predictions=predictions, references=references, mode="boundaries", only_all=False, relation_types=["belongs_to"])
+ print(evaluation_scores)
+ >>> {'belongs_to': {'tp': 0, 'fp': 0, 'fn': 1, 'p': 0, 'r': 0, 'f1': 0}, 'ALL': {'tp': 0, 'fp': 0, 'fn': 1, 'p': 0, 'r': 0, 'f1': 0, 'Macro_f1': 0.0, 'Macro_p': 0.0, 'Macro_r': 0.0}}
 ```
 
 ## Limitations and Bias
- This metric has strict filter mechanism, if any of the prediction's entity names, such as head, head_type, type, tail, or tail_type, is not exactly the same as the reference one. It will count as fp or fn.
+ This metric supports both strict and boundaries modes, and relation_types can restrict evaluation to selected relation types. Choose these parameters carefully, as the resulting F1 scores can differ substantially.
+ Prediction and reference entity names should match exactly, regardless of case and spaces; any prediction that does not match its reference is counted as a false positive or false negative.
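To make that note concrete, a small sketch that scores the same prediction under both modes (parameter names as in the examples above; as in the earlier sketches, `only_all=True` returning only the aggregate scores is an assumption):

```python
import evaluate

module = evaluate.load("Ikala-allen/relation_extraction")

references = [[
    {"head": "phipigments", "head_type": "brand", "type": "sell",
     "tail": "國際認證之色乳", "tail_type": "product"},
]]
# The same relation is predicted, but with a different head_type.
predictions = [[
    {"head": "phipigments", "head_type": "product", "type": "sell",
     "tail": "國際認證之色乳", "tail_type": "product"},
]]

# The choice of mode (and of relation_types) can change the scores
# substantially, so compare the modes on the same data before settling on one.
for mode in ("strict", "boundaries"):
    scores = module.compute(predictions=predictions, references=references,
                            mode=mode, only_all=True)
    print(mode, scores)
```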
 
 ## Citation
 ```bibtex