Ikala-allen committed
Commit
ee82c0f
1 Parent(s): 0d196a8

Update README.md

Files changed (1)
  1. README.md +17 -9
README.md CHANGED
@@ -5,11 +5,14 @@ datasets:
 tags:
 - evaluate
 - metric
-description: "This metric is used for evaluating the F1 accuracy of input references and predictions."
+description: >-
+  This metric is used for evaluating the F1 accuracy of input references and
+  predictions.
 sdk: gradio
 sdk_version: 3.19.1
 app_file: app.py
 pinned: false
+license: apache-2.0
 ---
 
 # Metric Card for relation_extraction evalutation
@@ -31,16 +34,14 @@ This metric takes 2 inputs, prediction and references(ground truth). Both of the
 ... {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
 ... ]
 ... ]
-
 >>> predictions = [
 ... [
 ... {"head": "phipigments", "head_type": "product", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
 ... {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
 ... ]
 ... ]
-
->>> evaluation_scores = module.compute(predictions=predictions, references=references)
->>> print(evaluation_scores)
+>>> evaluation_scores = module.compute(predictions=predictions, references=references)
+>>> print(evaluation_scores)
 {'sell': {'tp': 1, 'fp': 1, 'fn': 1, 'p': 50.0, 'r': 50.0, 'f1': 50.0}, 'ALL': {'tp': 1, 'fp': 1, 'fn': 1, 'p': 50.0, 'r': 50.0, 'f1': 50.0, 'Macro_f1': 50.0, 'Macro_p': 50.0, 'Macro_r': 50.0}}
 ```
 
@@ -126,10 +127,17 @@ Example with two or more prediction and reference:
 ```
 
 ## Limitations and Bias
-This metric has multiple known limitations:
+This metric uses strict matching: if any field of a predicted relation (head, head_type, type, tail, or tail_type) is not exactly the same as in the reference relation, the prediction is counted as a false positive (fp) and the unmatched reference relation as a false negative (fn).
 
 ## Citation
-*Cite the source where this metric was introduced.*
-
+```bibtex
+@misc{taille2020stop,
+  author = {Bruno Taillé and Vincent Guigue and Geoffrey Scoutheeten and Patrick Gallinari},
+  title  = {Let's Stop Incorrect Comparisons in End-to-end Relation Extraction!},
+  year   = {2020},
+  url    = {https://arxiv.org/abs/2009.10684},
+}
+```
 ## Further References
-*Add any useful further references.*
+This evaluation metric implementation is based on
+*https://github.com/btaille/sincere/blob/master/code/utils/evaluation.py*
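
For readers trying the example in the hunk above: the doctest assumes a `module` object already exists. Below is a minimal sketch of how it would typically be obtained with the `evaluate` library. The repository id `Ikala-allen/relation_extraction` and the first reference relation (which sits above the hunk and is not visible in this diff) are assumptions made for illustration, chosen so the output matches the documented tp=1, fp=1, fn=1.

```python
# Sketch only: load the metric as an `evaluate` module and run the README
# example end to end. The repository id is guessed from the author name, and
# the first reference relation is an illustrative stand-in that differs from
# its prediction only in head_type; neither is confirmed by this commit.
import evaluate

module = evaluate.load("Ikala-allen/relation_extraction")  # hypothetical repo id

references = [
    [
        {"head": "phipigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
        {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
    ]
]
predictions = [
    [
        {"head": "phipigments", "head_type": "product", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
        {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
    ]
]

evaluation_scores = module.compute(predictions=predictions, references=references)
print(evaluation_scores["ALL"])  # expected: tp=1, fp=1, fn=1, p=r=f1=50.0
```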
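The strict matching described in the new Limitations and Bias text can also be illustrated without loading the metric. The sketch below is not the metric's actual implementation (that follows the sincere evaluation code linked under Further References); it only shows how exact matching on all five fields produces the micro tp/fp/fn, precision, recall, and F1 reported under the `ALL` key, leaving out the per-type and macro breakdown. The helper name `strict_micro_scores` is invented for this illustration.

```python
# Illustrative sketch (not the metric's source code) of strict matching:
# a prediction is a true positive only when head, head_type, type, tail, and
# tail_type all match a reference relation exactly; any difference makes it
# a false positive and leaves the unmatched reference as a false negative.
KEYS = ("head", "head_type", "type", "tail", "tail_type")


def strict_micro_scores(predictions, references):
    tp = fp = fn = 0
    for pred_rels, ref_rels in zip(predictions, references):
        pred_set = {tuple(rel[k] for k in KEYS) for rel in pred_rels}
        ref_set = {tuple(rel[k] for k in KEYS) for rel in ref_rels}
        tp += len(pred_set & ref_set)  # exact matches on all five fields
        fp += len(pred_set - ref_set)  # predicted relations with any mismatch
        fn += len(ref_set - pred_set)  # reference relations never matched
    p = 100 * tp / (tp + fp) if tp + fp else 0.0
    r = 100 * tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "p": p, "r": r, "f1": f1}
```

With the README example, the mismatching head_type on the phipigments prediction yields tp=1, fp=1, fn=1 and p = r = f1 = 50.0, matching the `ALL` entry shown in the diff.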