Ikala-allen committed
Commit
ee82c0f
1 Parent(s): 0d196a8

Update README.md

Files changed (1)
  1. README.md +17 -9
README.md CHANGED
@@ -5,11 +5,14 @@ datasets:
 tags:
 - evaluate
 - metric
-description: "This metric is used for evaluating the F1 accuracy of input references and predictions."
+description: >-
+  This metric is used for evaluating the F1 accuracy of input references and
+  predictions.
 sdk: gradio
 sdk_version: 3.19.1
 app_file: app.py
 pinned: false
+license: apache-2.0
 ---
 
 # Metric Card for relation_extraction evalutation
@@ -31,16 +34,14 @@ This metric takes 2 inputs, prediction and references(ground truth). Both of the
 ... {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
 ... ]
 ... ]
-
 >>> predictions = [
 ... [
 ... {"head": "phipigments", "head_type": "product", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
 ... {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
 ... ]
 ... ]
-
->>> evaluation_scores = module.compute(predictions=predictions, references=references)
->>> print(evaluation_scores)
+>>> evaluation_scores = module.compute(predictions=predictions, references=references)
+>>> print(evaluation_scores)
 {'sell': {'tp': 1, 'fp': 1, 'fn': 1, 'p': 50.0, 'r': 50.0, 'f1': 50.0}, 'ALL': {'tp': 1, 'fp': 1, 'fn': 1, 'p': 50.0, 'r': 50.0, 'f1': 50.0, 'Macro_f1': 50.0, 'Macro_p': 50.0, 'Macro_r': 50.0}}
 ```
 
@@ -126,10 +127,17 @@ Example with two or more prediction and reference:
 ```
 
 ## Limitations and Bias
-This metric has multiple known limitations:
+This metric uses strict matching: if any field of a predicted relation (head, head_type, type, tail, or tail_type) is not exactly the same as in the reference relation, the prediction is counted as a false positive (fp) and the unmatched reference relation as a false negative (fn).
 
 ## Citation
-*Cite the source where this metric was introduced.*
-
+```bibtex
+@misc{taille2020stop,
+  author = {Bruno Taillé and Vincent Guigue and Geoffrey Scoutheeten and Patrick Gallinari},
+  title  = {Let's Stop Incorrect Comparisons in End-to-end Relation Extraction!},
+  year   = {2020},
+  url    = {https://arxiv.org/abs/2009.10684},
+}
+```
 ## Further References
-*Add any useful further references.*
+This evaluation metric implementation is based on
+*https://github.com/btaille/sincere/blob/master/code/utils/evaluation.py*
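
For readers trying the example in the hunk above: the doctest assumes a `module` object already exists. Below is a minimal sketch of how it would typically be obtained with the `evaluate` library. The repository id `Ikala-allen/relation_extraction` and the first reference relation (which sits above the hunk and is not visible in this diff) are assumptions made for illustration, chosen so the output matches the documented tp=1, fp=1, fn=1.

```python
# Sketch only: load the metric as an `evaluate` module and run the README
# example end to end. The repository id is guessed from the author name, and
# the first reference relation is an illustrative stand-in that differs from
# its prediction only in head_type; neither is confirmed by this commit.
import evaluate

module = evaluate.load("Ikala-allen/relation_extraction")  # hypothetical repo id

references = [
    [
        {"head": "phipigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
        {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
    ]
]
predictions = [
    [
        {"head": "phipigments", "head_type": "product", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
        {"head": "tinadaviespigments", "head_type": "brand", "type": "sell", "tail": "國際認證之色乳", "tail_type": "product"},
    ]
]

evaluation_scores = module.compute(predictions=predictions, references=references)
print(evaluation_scores["ALL"])  # expected: tp=1, fp=1, fn=1, p=r=f1=50.0
```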
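The strict matching described in the new Limitations and Bias text can also be illustrated without loading the metric. The sketch below is not the metric's actual implementation (that follows the sincere evaluation code linked under Further References); it only shows how exact matching on all five fields produces the micro tp/fp/fn, precision, recall, and F1 reported under the `ALL` key, leaving out the per-type and macro breakdown. The helper name `strict_micro_scores` is invented for this illustration.

```python
# Illustrative sketch (not the metric's source code) of strict matching:
# a prediction is a true positive only when head, head_type, type, tail, and
# tail_type all match a reference relation exactly; any difference makes it
# a false positive and leaves the unmatched reference as a false negative.
KEYS = ("head", "head_type", "type", "tail", "tail_type")


def strict_micro_scores(predictions, references):
    tp = fp = fn = 0
    for pred_rels, ref_rels in zip(predictions, references):
        pred_set = {tuple(rel[k] for k in KEYS) for rel in pred_rels}
        ref_set = {tuple(rel[k] for k in KEYS) for rel in ref_rels}
        tp += len(pred_set & ref_set)  # exact matches on all five fields
        fp += len(pred_set - ref_set)  # predicted relations with any mismatch
        fn += len(ref_set - pred_set)  # reference relations never matched
    p = 100 * tp / (tp + fp) if tp + fp else 0.0
    r = 100 * tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "p": p, "r": r, "f1": f1}
```

With the README example, the mismatching head_type on the phipigments prediction yields tp=1, fp=1, fn=1 and p = r = f1 = 50.0, matching the `ALL` entry shown in the diff.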