JohnnyBoy00 committed on
Commit
99ad4c5
1 Parent(s): 4242bac

update model card README.md

Files changed (1)
  1. README.md +21 -85
README.md CHANGED
@@ -1,73 +1,49 @@
  ---
- language: de
- datasets:
- - Short-Answer-Feedback/saf_legal_domain_german
  tags:
  - generated_from_trainer
- widget:
- - text: "Antwort: Kommt man einer besoderen Aufforderung nach und zeigt den Unfall unverzüglich an, besteht Versicherungsschutz falls ein Unfall eintritt. Lösung: Merkblatt für Arbeitslose, S. 77: Als Bezieher von Arbeitslosengeld sind Sie gegen Unfall versichert, während Sie einer besonderen Aufforderung nachkommen, eine Agentur für Arbeit oder andere Stelle aufzusuchen (z. B. zur ärztlichen Untersuchung, Vorstellung beim Arbeitgeber, Eingliederungsmaßnahme). Einen Unfall müssen Sie sofort Ihrer Agentur für Arbeit anzeigen. Frage: Inwieweit sind Sie während des Bezugs von Arbeitslosengeld gegen einen Unfall versichert und was sollten Sie nach einem Unfall tun?"
+ model-index:
+ - name: mbart-score-finetuned-saf-legal-domain
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
  # mbart-score-finetuned-saf-legal-domain

- This model is a fine-tuned version of [facebook/mbart-large-cc25](https://huggingface.co/facebook/mbart-large-cc25) on the [saf_legal_domain_german](https://huggingface.co/datasets/Short-Answer-Feedback/saf_legal_domain_german) dataset for Short Answer Feedback (SAF).
+ This model is a fine-tuned version of [facebook/mbart-large-cc25](https://huggingface.co/facebook/mbart-large-cc25) on the None dataset.

  ## Model description

- This model was built on top of [mBART](https://arxiv.org/abs/2001.08210), which is a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages.
-
- It expects inputs in the following format:
- ```
- Antwort: [answer] Lösung: [reference_answer] Frage: [question]
- ```
-
- In the example above, `[answer]`, `[reference_answer]` and `[question]` should be replaced by the provided answer, the reference answer and the question to which they refer, respectively.
-
-
- The outputs are formatted as follows:
- ```
- [score] Feedback: [feedback]
- ```
-
- Hence, `[score]` will be a numeric value between 0 and 1, while `[feedback]` will be the textual feedback generated by the model according to the given answer.
+ More information needed

  ## Intended uses & limitations

- This model is intended to be used for Short Answer Feedback generation in the domain of the German social law. Thus, it is not expected to have particularly good performance on sets of questions and answers out of this scope.
-
- It is important to acknowledge that the model underperforms when a question that was not seen during training is given as input for inference. In particular, it tends to classify most answers as being correct and does not provide relevant feedback in such cases. Nevertheless, this limitation could be partially overcome by extending the dataset with the desired question (and associated answers) and fine-tuning it for a few epochs on the new data.
+ More information needed

  ## Training and evaluation data

- As mentioned previously, the model was trained on the [saf_legal_domain_german](https://huggingface.co/datasets/Short-Answer-Feedback/saf_legal_domain_german) dataset, which is divided into the following splits.
-
- | Split | Number of examples |
- | --------------------- | ------------------ |
- | train | 1596 |
- | validation | 400 |
- | test_unseen_answers | 221 |
- | test_unseen_questions | 275 |
-
- Evaluation was performed on the `test_unseen_answers` and `test_unseen_questions` splits.
+ More information needed

  ## Training procedure

- The [Trainer API](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.Seq2SeqTrainer) was used to fine-tune the model. The code utilized for pre-processing and training was mostly adapted from the [summarization script](https://github.com/huggingface/transformers/tree/main/examples/pytorch/summarization) made available by HuggingFace.
-
- Training was completed in a little over 1 hour on a GPU on Google Colab.
-
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - num_epochs: 10
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - learning_rate: 5e-05
- - lr_scheduler_type: linear
+ - learning_rate: 6e-05
  - train_batch_size: 1
- - gradient_accumulation_steps: 4
  - eval_batch_size: 4
- - mixed_precision_training: Native AMP
  - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 4
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 9
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
+

  ### Framework versions

@@ -75,43 +51,3 @@ The following hyperparameters were used during training:
  - Pytorch 1.13.1+cu116
  - Datasets 2.9.0
  - Tokenizers 0.13.2
-
- ## Evaluation results
-
- The generated feedback was evaluated through means of the [SacreBLEU](https://huggingface.co/spaces/evaluate-metric/sacrebleu), [ROUGE-2](https://huggingface.co/spaces/evaluate-metric/rouge), [METEOR](https://huggingface.co/spaces/evaluate-metric/meteor), [BERTScore](https://huggingface.co/spaces/evaluate-metric/bertscore) metrics from HuggingFace, while the [Root Mean Squared Error](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error) loss from scikit-learn was used for evaluation of the predicted scores in relation to the golden label scores.
-
- The following results were achieved.
-
- | Split | SacreBLEU | ROUGE-2 | METEOR | BERTScore | RMSE |
- | --------------------- | :-------: | :-----: | :----: | :-------: | :---: |
- | test_unseen_answers | 33.7 | 37.2 | 50.7 | 45.0 | 0.264 |
- | test_unseen_questions | 2.9 | 5.7 | 17.0 | 10.8 | 0.331 |
-
- The script used to compute these metrics and perform evaluation can be found in the `evaluation.py` file in this repository.
-
- ## Usage
-
- The example below shows how the model can be applied to generate feedback to a given answer.
-
- ```python
- from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
-
- model = AutoModelForSeq2SeqLM.from_pretrained('Short-Answer-Feedback/mbart-score-finetuned-saf-legal-domain')
- tokenizer = AutoTokenizer.from_pretrained('Short-Answer-Feedback/mbart-score-finetuned-saf-legal-domain')
-
- example_input = 'Antwort: Kommt man einer besoderen Aufforderung nach und zeigt den Unfall unverzüglich an, besteht Versicherungsschutz falls ein Unfall eintritt. Lösung: Merkblatt für Arbeitslose, S. 77: Als Bezieher von Arbeitslosengeld sind Sie gegen Unfall versichert, während Sie einer besonderen Aufforderung nachkommen, eine Agentur für Arbeit oder andere Stelle aufzusuchen (z. B. zur ärztlichen Untersuchung, Vorstellung beim Arbeitgeber, Eingliederungsmaßnahme). Einen Unfall müssen Sie sofort Ihrer Agentur für Arbeit anzeigen. Frage: Inwieweit sind Sie während des Bezugs von Arbeitslosengeld gegen einen Unfall versichert und was sollten Sie nach einem Unfall tun?'
- inputs = tokenizer(example_input, max_length=256, padding='max_length', truncation=True, return_tensors='pt')
-
- generated_tokens = model.generate(
-     inputs['input_ids'],
-     attention_mask=inputs['attention_mask'],
-     max_length=128
- )
- output = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
- ```
-
- The output produced by the model then looks as follows:
-
- ```
- 0.875 Feedback: Ihre Antwort ist richtig. Bitte beachten Sie, dass diese Aufforderung sowohl einen Termin bei der Agentur für Arbeit als auch die Vorstellung bei der Vorstellung bei der Agentur für Arbeit beinhalten kann - beispielsweise bei einer Vorstellung bei einer anderen Stelle. Bitte melden Sie einen entsprechenden Unfall sofort Ihrer Agentur für Arbeit.
- ```
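
Note on the updated hyperparameter list: the `total_train_batch_size: 4` entry appears to be derived from `train_batch_size` and `gradient_accumulation_steps` rather than set independently. The sketch below reproduces that arithmetic. The single-device setting (`num_devices = 1`) and the helper `optimizer_steps` are assumptions made for illustration and are not part of the commit.

```python
import math

# Values copied from the updated hyperparameter list in this commit.
train_batch_size = 1
gradient_accumulation_steps = 4
num_epochs = 9

# Assumption: training ran on a single device, so the per-device batch
# size is not multiplied by a device count greater than one.
num_devices = 1

# Effective batch size per optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)


def optimizer_steps(num_examples: int) -> int:
    """Rough number of optimizer updates over all epochs (hypothetical helper)."""
    batches_per_epoch = math.ceil(num_examples / (train_batch_size * num_devices))
    updates_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)
    return updates_per_epoch * num_epochs
```

For example, `optimizer_steps(1596)` uses the train split size stated in the removed card text. Whether this run used the same split is not confirmed by the commit, since the regenerated card lists the dataset as `None`.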