Upload folder using huggingface_hub

#2
model_hyperparameters.json ADDED
The diff for this file is too large to render. See raw diff
 
model_weights/README.md ADDED
@@ -0,0 +1,34 @@
+ ---
+ library_name: peft
+ ---
+ ## Training procedure
+
+
+ The following `bitsandbytes` quantization config was used during training:
+ - quant_method: bitsandbytes
+ - load_in_8bit: False
+ - load_in_4bit: True
+ - llm_int8_threshold: 6.0
+ - llm_int8_skip_modules: None
+ - llm_int8_enable_fp32_cpu_offload: False
+ - llm_int8_has_fp16_weight: False
+ - bnb_4bit_quant_type: nf4
+ - bnb_4bit_use_double_quant: True
+ - bnb_4bit_compute_dtype: float16
+
+ The following `bitsandbytes` quantization config was used during training:
+ - quant_method: bitsandbytes
+ - load_in_8bit: False
+ - load_in_4bit: True
+ - llm_int8_threshold: 6.0
+ - llm_int8_skip_modules: None
+ - llm_int8_enable_fp32_cpu_offload: False
+ - llm_int8_has_fp16_weight: False
+ - bnb_4bit_quant_type: nf4
+ - bnb_4bit_use_double_quant: True
+ - bnb_4bit_compute_dtype: float16
+ ### Framework versions
+
+ - PEFT 0.5.0
+
+ - PEFT 0.5.0
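The quantization settings listed in this README correspond to keyword arguments of `transformers.BitsAndBytesConfig`. The following is a minimal sketch of how they might be used to reload the adapter, not the uploader's actual loading code. It assumes `transformers`, `peft`, `bitsandbytes`, and `torch` are installed and a CUDA GPU is available. The `adapter_dir` argument is a hypothetical local path to the downloaded `model_weights` folder, and the base model name is taken from `adapter_config.json` in this same PR.

```python
# Quantization settings copied from the README in this diff.
# "quant_method" is omitted because it is not a constructor argument here.
QUANT_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "float16",
}

# Base model name as recorded in adapter_config.json.
BASE_MODEL = "FlagAlpha/Llama2-Chinese-7b-Chat"


def load_quantized_with_adapter(adapter_dir: str):
    """Load the 4-bit base model and attach the LoRA adapter (assumed local path)."""
    # Heavy imports stay local so the settings above can be inspected without them.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(**QUANT_KWARGS)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL,
        quantization_config=quant_config,
        device_map="auto",
    )
    return PeftModel.from_pretrained(base, adapter_dir)
```

Calling `load_quantized_with_adapter("model_weights")` would download the base model on first use, so it is best tried on a machine with enough disk space and GPU memory.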
model_weights/adapter_config.json ADDED
@@ -0,0 +1,21 @@
+ {
+   "auto_mapping": null,
+   "base_model_name_or_path": "FlagAlpha/Llama2-Chinese-7b-Chat",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "lora_alpha": 16,
+   "lora_dropout": 0.05,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 8,
+   "revision": null,
+   "target_modules": [
+     "q_proj",
+     "v_proj"
+   ],
+   "task_type": "CAUSAL_LM"
+ }
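As a rough sanity check on this configuration, the LoRA scaling factor and the number of adapter parameters can be derived from `r`, `lora_alpha`, and `target_modules`. The sketch below assumes the standard LoRA scaling `lora_alpha / r` and the usual Llama-2-7B shapes (hidden size 4096, 32 decoder layers, square `q_proj` and `v_proj`). Those shapes are assumptions about the base model and are not stated anywhere in this PR.

```python
import json

# Relevant fields copied from adapter_config.json in this diff.
ADAPTER_CONFIG = json.loads("""
{
  "lora_alpha": 16,
  "r": 8,
  "target_modules": ["q_proj", "v_proj"]
}
""")

# Assumed Llama-2-7B architecture values (not part of this PR).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32


def lora_scaling(config: dict) -> float:
    """Standard LoRA scaling applied to the low-rank update."""
    return config["lora_alpha"] / config["r"]


def lora_param_count(config: dict, hidden_size: int, num_layers: int) -> int:
    """Count adapter weights, assuming every target module is hidden x hidden."""
    # Each adapted module gets A (r x hidden) and B (hidden x r).
    per_module = config["r"] * (hidden_size + hidden_size)
    return per_module * len(config["target_modules"]) * num_layers


if __name__ == "__main__":
    params = lora_param_count(ADAPTER_CONFIG, HIDDEN_SIZE, NUM_LAYERS)
    print(lora_scaling(ADAPTER_CONFIG))  # 2.0
    print(params, params * 4)            # 4194304 16777216
```

Under these assumptions the adapter holds 4,194,304 weights, or 16,777,216 bytes in float32. That is close to the 16822989-byte size recorded for `adapter_model.bin` in this PR, with the remainder plausibly being serialization overhead.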
model_weights/adapter_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c2e7ea9cb6a39f3a02a81f35b01328d22b3869937357c24cb97de8fb6eae7eb6
+ size 16822989
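The three lines above are a Git LFS pointer, not the weights themselves: they record the hash algorithm, the digest, and the byte size of the real file. A small sketch of how a downloaded file could be checked against such a pointer, using only the Python standard library, follows. The function names are illustrative and are not part of any Git LFS tooling.

```python
import hashlib

# Pointer text copied from the adapter_model.bin entry in this diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:c2e7ea9cb6a39f3a02a81f35b01328d22b3869937357c24cb97de8fb6eae7eb6
size 16822989
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at path has the size and digest the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("adapter_model.bin", parse_lfs_pointer(POINTER_TEXT))` would report whether a local download is intact, assuming the file was saved under that name.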
training_checkpoints/best.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2084f34cf3fb65da11a40348baf7b9b6132dacf5f0594a60e88ec72ffcd1a690
+ size 50543611
training_progress.json ADDED
@@ -0,0 +1,2451 @@
+ {
+   "batch_size": 2,
+   "best_eval_metric_checkpoint_number": 7,
+   "best_eval_metric_epoch": 7,
+   "best_eval_metric_steps": 19390,
+   "best_eval_metric_value": 0.028326265513896942,
+   "best_eval_test_metrics": {
+     "combined": {
+       "loss": 0.02837251126766205
+     },
+     "output": {
+       "bleu": 0.2571483254432678,
+       "char_error_rate": 2.8023552894592285,
+       "loss": 0.02837251126766205,
+       "next_token_perplexity": 11895.607421875,
+       "perplexity": 31894.921875,
+       "rouge1_fmeasure": 0.5043888688087463,
+       "rouge1_precision": 0.35199737548828125,
+       "rouge1_recall": 0.9673399329185486,
+       "rouge2_fmeasure": 0.48816248774528503,
+       "rouge2_precision": 0.3395858705043793,
+       "rouge2_recall": 0.9494638442993164,
+       "rougeL_fmeasure": 0.5036171078681946,
+       "rougeL_precision": 0.35143783688545227,
+       "rougeL_recall": 0.9660221934318542,
+       "rougeLsum_fmeasure": 0.5034310817718506,
+       "rougeLsum_precision": 0.35130569338798523,
+       "rougeLsum_recall": 0.9656680822372437,
+       "sequence_accuracy": 0.0,
+       "token_accuracy": 0.0004706201143562794,
+       "word_error_rate": 2.537461757659912
+     }
+   },
+   "best_eval_train_metrics": {
+     "combined": {
+       "loss": 0.010627036914229393
+     },
+     "output": {
+       "bleu": 0.19371718168258667,
+       "char_error_rate": 3.023742914199829,
+       "loss": 0.010627036914229393,
+       "next_token_perplexity": 11850.654296875,
+       "perplexity": 31999.515625,
+       "rouge1_fmeasure": 0.5099087953567505,
+       "rouge1_precision": 0.34651726484298706,
+       "rouge1_recall": 1.0,
+       "rouge2_fmeasure": 0.5034307837486267,
+       "rouge2_precision": 0.3408345580101013,
+       "rouge2_recall": 1.0,
+       "rougeL_fmeasure": 0.5099087953567505,
+       "rougeL_precision": 0.34651726484298706,
+       "rougeL_recall": 1.0,
+       "rougeLsum_fmeasure": 0.5099087953567505,
+       "rougeLsum_precision": 0.34651726484298706,
+       "rougeLsum_recall": 1.0,
+       "sequence_accuracy": 0.0,
+       "token_accuracy": 0.0,
+       "word_error_rate": 2.925373077392578
+     }
+   },
+   "best_eval_validation_metrics": {
+     "combined": {
+       "loss": 0.028326265513896942
+     },
+     "output": {
+       "bleu": 0.2575768232345581,
+       "char_error_rate": 2.8845763206481934,
+       "loss": 0.028326265513896942,
+       "next_token_perplexity": 11894.1640625,
+       "perplexity": 31888.74609375,
+       "rouge1_fmeasure": 0.5003785490989685,
+       "rouge1_precision": 0.3479803204536438,
+       "rouge1_recall": 0.9642844796180725,
+       "rouge2_fmeasure": 0.4844965636730194,
+       "rouge2_precision": 0.3358590602874756,
+       "rouge2_recall": 0.946923017501831,
+       "rougeL_fmeasure": 0.49971580505371094,
+       "rougeL_precision": 0.34750500321388245,
+       "rougeL_recall": 0.9631230235099792,
+       "rougeLsum_fmeasure": 0.499582976102829,
+       "rougeLsum_precision": 0.34741097688674927,
+       "rougeLsum_recall": 0.9628673195838928,
+       "sequence_accuracy": 0.0,
+       "token_accuracy": 0.00046722154365852475,
+       "word_error_rate": 2.64115047454834
+     }
+   },
+   "best_increase_batch_size_eval_metric": Infinity,
+   "checkpoint_number": 7,
+   "epoch": 8,
+   "last_improvement_steps": 0,
+   "last_increase_batch_size": 0,
+   "last_increase_batch_size_eval_metric_improvement": 0,
+   "last_increase_batch_size_steps": 0,
+   "last_learning_rate_reduction": 0,
+   "last_learning_rate_reduction_steps": 0,
+   "learning_rate": 0.0001,
+   "num_increases_batch_size": 0,
+   "num_reductions_learning_rate": 0,
+   "steps": 19390,
+   "test_metrics": {
+     "combined": {
+       "loss": [
+         [1, 2770, 0.056570280343294144],
+         [2, 5540, 0.03862074762582779],
+         [3, 8310, 0.0322396382689476],
+         [4, 11080, 0.029598653316497803],
+         [5, 13850, 0.029046066105365753],
+         [6, 16620, 0.029180288314819336],
+         [7, 19390, 0.02837251126766205]
+       ]
+     },
+     "output": {
+       "bleu": [
+         [1, 2770, 0.2881736159324646],
+         [2, 5540, 0.30035901069641113],
+         [3, 8310, 0.29325926303863525],
+         [4, 11080, 0.2912749648094177],
+         [5, 13850, 0.29473328590393066],
+         [6, 16620, 0.2825150489807129],
+         [7, 19390, 0.2571483254432678]
+       ],
+       "char_error_rate": [
+         [1, 2770, 3.7703747749328613],
+         [2, 5540, 2.7834644317626953],
+         [3, 8310, 2.8107423782348633],
+         [4, 11080, 2.7512333393096924],
+         [5, 13850, 2.7991256713867188],
+         [6, 16620, 2.74367618560791],
+         [7, 19390, 2.8023552894592285]
+       ],
+       "loss": [
+         [1, 2770, 0.056570280343294144],
+         [2, 5540, 0.03862074762582779],
+         [3, 8310, 0.0322396382689476],
+         [4, 11080, 0.029598653316497803],
+         [5, 13850, 0.029046066105365753],
+         [6, 16620, 0.029180288314819336],
+         [7, 19390, 0.02837251126766205]
+       ],
+       "next_token_perplexity": [
+         [1, 2770, 12036.4189453125],
+         [2, 5540, 11973.7529296875],
+         [3, 8310, 11931.3291015625],
+         [4, 11080, 11918.92578125],
+         [5, 13850, 11906.0029296875],
+         [6, 16620, 11907.7744140625],
+         [7, 19390, 11895.607421875]
+       ],
+       "perplexity": [
+         [1, 2770, 31891.08984375],
+         [2, 5540, 31882.876953125],
+         [3, 8310, 31891.08984375],
+         [4, 11080, 31888.59375],
+         [5, 13850, 31894.5859375],
+         [6, 16620, 31886.49609375],
+         [7, 19390, 31894.921875]
+       ],
+       "rouge1_fmeasure": [
+         [1, 2770, 0.45583832263946533],
+         [2, 5540, 0.46054157614707947],
+         [3, 8310, 0.46443498134613037],
+         [4, 11080, 0.4832415282726288],
+         [5, 13850, 0.4780387878417969],
+         [6, 16620, 0.49345338344573975],
+         [7, 19390, 0.5043888688087463]
+       ],
+       "rouge1_precision": [
+         [1, 2770, 0.31005945801734924],
+         [2, 5540, 0.31317880749702454],
+         [3, 8310, 0.3162654638290405],
+         [4, 11080, 0.33289361000061035],
+         [5, 13850, 0.3283507823944092],
+         [6, 16620, 0.3420336842536926],
+         [7, 19390, 0.35199737548828125]
+       ],
+       "rouge1_recall": [
+         [1, 2770, 0.9494807720184326],
+         [2, 5540, 0.9610881805419922],
+         [3, 8310, 0.9647314548492432],
+         [4, 11080, 0.9663017988204956],
+         [5, 13850, 0.9644144177436829],
+         [6, 16620, 0.9666685461997986],
+         [7, 19390, 0.9673399329185486]
+       ],
+       "rouge2_fmeasure": [
+         [1, 2770, 0.4300532042980194],
+         [2, 5540, 0.44140955805778503],
+         [3, 8310, 0.4470669627189636],
+         [4, 11080, 0.46725624799728394],
+         [5, 13850, 0.4617525637149811],
+         [6, 16620, 0.47793200612068176],
+         [7, 19390, 0.48816248774528503]
+       ],
+       "rouge2_precision": [
+         [1, 2770, 0.29176947474479675],
+         [2, 5540, 0.29934147000312805],
+         [3, 8310, 0.3036028742790222],
+         [4, 11080, 0.32093188166618347],
+         [5, 13850, 0.31624865531921387],
+         [6, 16620, 0.33026161789894104],
+         [7, 19390, 0.3395858705043793]
+       ],
+       "rouge2_recall": [
+         [1, 2770, 0.9086052179336548],
+         [2, 5540, 0.9347748756408691],
+         [3, 8310, 0.9421030879020691],
+         [4, 11080, 0.9477611780166626],
+         [5, 13850, 0.9448394179344177],
+         [6, 16620, 0.9496001601219177],
+         [7, 19390, 0.9494638442993164]
+       ],
+       "rougeL_fmeasure": [
+         [1, 2770, 0.45452964305877686],
+         [2, 5540, 0.45963969826698303],
+         [3, 8310, 0.4637087881565094],
+         [4, 11080, 0.48262497782707214],
+         [5, 13850, 0.4772895872592926],
+         [6, 16620, 0.4928056597709656],
+         [7, 19390, 0.5036171078681946]
+       ],
+       "rougeL_precision": [
+         [1, 2770, 0.30914732813835144],
+         [2, 5540, 0.3125472366809845],
+         [3, 8310, 0.3157593905925751],
+         [4, 11080, 0.33245277404785156],
+         [5, 13850, 0.3278239071369171],
+         [6, 16620, 0.3415752649307251],
+         [7, 19390, 0.35143783688545227]
+       ],
+       "rougeL_recall": [
+         [1, 2770, 0.9469588994979858],
+         [2, 5540, 0.9593668580055237],
+         [3, 8310, 0.963313639163971],
+         [4, 11080, 0.9652055501937866],
+         [5, 13850, 0.9630056023597717],
+         [6, 16620, 0.9654869437217712],
+         [7, 19390, 0.9660221934318542]
+       ],
+       "rougeLsum_fmeasure": [
+         [1, 2770, 0.4544520974159241],
+         [2, 5540, 0.4593540132045746],
+         [3, 8310, 0.463616281747818],
+         [4, 11080, 0.4826125502586365],
+         [5, 13850, 0.47729507088661194],
+         [6, 16620, 0.4928112030029297],
+         [7, 19390, 0.5034310817718506]
+       ],
+       "rougeLsum_precision": [
+         [1, 2770, 0.30910027027130127],
+         [2, 5540, 0.31235170364379883],
+         [3, 8310, 0.31568989157676697],
+         [4, 11080, 0.33244553208351135],
+         [5, 13850, 0.3278290927410126],
+         [6, 16620, 0.3415812849998474],
+         [7, 19390, 0.35130569338798523]
+       ],
+       "rougeLsum_recall": [
+         [1, 2770, 0.9467605948448181],
+         [2, 5540, 0.9587725400924683],
+         [3, 8310, 0.9631702303886414],
+         [4, 11080, 0.9651756882667542],
+         [5, 13850, 0.9630022644996643],
+         [6, 16620, 0.9654796719551086],
+         [7, 19390, 0.9656680822372437]
+       ],
+       "sequence_accuracy": [
+         [1, 2770, 0.0],
+         [2, 5540, 0.0],
+         [3, 8310, 0.0],
+         [4, 11080, 0.0],
+         [5, 13850, 0.0],
+         [6, 16620, 0.0],
+         [7, 19390, 0.0]
+       ],
+       "token_accuracy": [
+         [1, 2770, 0.00046411342918872833],
+         [2, 5540, 0.00046866878983564675],
+         [3, 8310, 0.00046727192238904536],
+         [4, 11080, 0.0004687990585807711],
+         [5, 13850, 0.0004691674548666924],
+         [6, 16620, 0.0004687117470894009],
+         [7, 19390, 0.0004706201143562794]
+       ],
+       "word_error_rate": [
+         [1, 2770, 1.9269943237304688],
+         [2, 5540, 1.9473299980163574],
+         [3, 8310, 2.0793614387512207],
+         [4, 11080, 2.1502275466918945],
+         [5, 13850, 2.092580556869507],
+         [6, 16620, 2.2052907943725586],
+         [7, 19390, 2.537461757659912]
+       ]
+     }
+   },
+   "train_metrics": {
+     "combined": {
+       "loss": [
+         [1, 2770, 0.16293130815029144],
+         [2, 5540, 0.012954902835190296],
+         [3, 8310, 0.03785233944654465],
+         [4, 11080, 0.020756859332323074],
+         [5, 13850, 0.011303349398076534],
+         [6, 16620, 0.012777533382177353],
+         [7, 19390, 0.010627036914229393]
+       ]
+     },
+     "output": {
+       "bleu": [
+         [1, 2770, 0.20123623311519623],
+         [2, 5540, 0.3794978857040405],
+         [3, 8310, 0.2412903904914856],
+         [4, 11080, 0.36655688285827637],
+         [5, 13850, 0.1585085093975067],
+         [6, 16620, 0.2643570303916931],
+         [7, 19390, 0.19371718168258667]
+       ],
+       "char_error_rate": [
+         [1, 2770, 7.55555534362793],
+         [2, 5540, 2.9573256969451904],
+         [3, 8310, 4.503154754638672],
+         [4, 11080, 3.002485513687134],
+         [5, 13850, 2.344075918197632],
+         [6, 16620, 4.294294357299805],
+         [7, 19390, 3.023742914199829]
+       ],
+       "loss": [
+         [1, 2770, 0.16293130815029144],
+         [2, 5540, 0.012954902835190296],
+         [3, 8310, 0.03785233944654465],
+         [4, 11080, 0.020756859332323074],
+         [5, 13850, 0.011303349398076534],
+         [6, 16620, 0.012777533382177353],
+         [7, 19390, 0.010627036914229393]
+       ],
+       "next_token_perplexity": [
+         [1, 2770, 12304.353515625],
+         [2, 5540, 11869.2724609375],
+         [3, 8310, 11982.583984375],
+         [4, 11080, 11896.8330078125],
+         [5, 13850, 11849.265625],
+         [6, 16620, 11852.1123046875],
+         [7, 19390, 11850.654296875]
+       ],
+       "perplexity": [
+         [1, 2770, 31993.228515625],
+         [2, 5540, 31764.84765625],
+         [3, 8310, 31999.63671875],
+         [4, 11080, 31889.169921875],
+         [5, 13850, 31996.4921875],
+         [6, 16620, 31996.15625],
+         [7, 19390, 31999.515625]
+       ],
+       "rouge1_fmeasure": [
+         [1, 2770, 0.2195121943950653],
+         [2, 5540, 0.4458082616329193],
+         [3, 8310, 0.4645876884460449],
+         [4, 11080, 0.5988943576812744],
+         [5, 13850, 0.5516705513000488],
+         [6, 16620, 0.45763128995895386],
+         [7, 19390, 0.5099087953567505]
+       ],
+       "rouge1_precision": [
+         [1, 2770, 0.140625],
+         [2, 5540, 0.2905769348144531],
+         [3, 8310, 0.3042147159576416],
+         [4, 11080, 0.43105676770210266],
+         [5, 13850, 0.40321463346481323],
+         [6, 16620, 0.2989655137062073],
+         [7, 19390, 0.34651726484298706]
+       ],
+       "rouge1_recall": [
+         [1, 2770, 0.5],
+         [2, 5540, 1.0],
+         [3, 8310, 1.0],
+         [4, 11080, 1.0],
+         [5, 13850, 0.9948453903198242],
+         [6, 16620, 0.9900000095367432],
+         [7, 19390, 1.0]
+       ],
+       "rouge2_fmeasure": [
+         [1, 2770, 0.20370370149612427],
+         [2, 5540, 0.438376247882843],
+         [3, 8310, 0.451366126537323],
+         [4, 11080, 0.5951134562492371],
+         [5, 13850, 0.5413873195648193],
+         [6, 16620, 0.4451362192630768],
+         [7, 19390, 0.5034307837486267]
+       ],
+       "rouge2_precision": [
+         [1, 2770, 0.12992125749588013],
+         [2, 5540, 0.28464847803115845],
+         [3, 8310, 0.29411762952804565],
+         [4, 11080, 0.4273049831390381],
+         [5, 13850, 0.39483463764190674],
+         [6, 16620, 0.2894570827484131],
+         [7, 19390, 0.3408345580101013]
+       ],
+       "rouge2_recall": [
+         [1, 2770, 0.4714285731315613],
+         [2, 5540, 1.0],
+         [3, 8310, 0.9893617033958435],
+         [4, 11080, 1.0],
+         [5, 13850, 0.9895833730697632],
+         [6, 16620, 0.9795918464660645],
+         [7, 19390, 1.0]
+       ],
+       "rougeL_fmeasure": [
+         [1, 2770, 0.2195121943950653],
+         [2, 5540, 0.4458082616329193],
+         [3, 8310, 0.4645876884460449],
+         [4, 11080, 0.5949573516845703],
+         [5, 13850, 0.5516705513000488],
+         [6, 16620, 0.45763128995895386],
+         [7, 19390, 0.5099087953567505]
+       ],
+       "rougeL_precision": [
+         [1, 2770, 0.140625],
+         [2, 5540, 0.2905769348144531],
+         [3, 8310, 0.3042147159576416],
+         [4, 11080, 0.428098201751709],
+         [5, 13850, 0.40321463346481323],
+         [6, 16620, 0.2989655137062073],
+         [7, 19390, 0.34651726484298706]
+       ],
+       "rougeL_recall": [
+         [1, 2770, 0.5],
+         [2, 5540, 1.0],
+         [3, 8310, 1.0],
+         [4, 11080, 0.9941176176071167],
+         [5, 13850, 0.9948453903198242],
+         [6, 16620, 0.9900000095367432],
+         [7, 19390, 1.0]
+       ],
+       "rougeLsum_fmeasure": [
+         [1, 2770, 0.2195121943950653],
+         [2, 5540, 0.4458082616329193],
+         [3, 8310, 0.4645876884460449],
+         [4, 11080, 0.5949573516845703],
+         [5, 13850, 0.5516705513000488],
+         [6, 16620, 0.45763128995895386],
+         [7, 19390, 0.5099087953567505]
+       ],
+       "rougeLsum_precision": [
+         [1, 2770, 0.140625],
+         [2, 5540, 0.2905769348144531],
+         [3, 8310, 0.3042147159576416],
+         [4, 11080, 0.428098201751709],
+         [5, 13850, 0.40321463346481323],
+         [6, 16620, 0.2989655137062073],
+         [7, 19390, 0.34651726484298706]
+       ],
+       "rougeLsum_recall": [
+         [1, 2770, 0.5],
+         [2, 5540, 1.0],
+         [3, 8310, 1.0],
+         [4, 11080, 0.9941176176071167],
+         [5, 13850, 0.9948453903198242],
+         [6, 16620, 0.9900000095367432],
+         [7, 19390, 1.0]
+       ],
+       "sequence_accuracy": [
+         [1, 2770, 0.0],
+         [2, 5540, 0.0],
+         [3, 8310, 0.0],
+         [4, 11080, 0.0],
+         [5, 13850, 0.0],
+         [6, 16620, 0.0],
+         [7, 19390, 0.0]
+       ],
+       "token_accuracy": [
+         [1, 2770, 0.0],
+         [2, 5540, 0.0011185682378709316],
+         [3, 8310, 0.0],
+         [4, 11080, 0.0004681647988036275],
+         [5, 13850, 0.0],
+         [6, 16620, 0.0],
+         [7, 19390, 0.0]
+       ],
+       "word_error_rate": [
+         [1, 2770, 4.363636493682861],
+         [2, 5540, 2.318840503692627],
+         [3, 8310, 2.492063522338867],
+         [4, 11080, 1.4741379022598267],
+         [5, 13850, 1.7572815418243408],
+         [6, 16620, 2.890625],
+         [7, 19390, 2.925373077392578]
+       ]
+     }
+   },
1667
+ "tune_checkpoint_num": 0,
1668
+ "validation_metrics": {
1669
+ "combined": {
1670
+ "loss": [
1671
+ [
1672
+ 1,
1673
+ 2770,
1674
+ 0.056642960757017136
1675
+ ],
1676
+ [
1677
+ 2,
1678
+ 5540,
1679
+ 0.03854465112090111
1680
+ ],
1681
+ [
1682
+ 3,
1683
+ 8310,
1684
+ 0.03133770078420639
1685
+ ],
1686
+ [
1687
+ 4,
1688
+ 11080,
1689
+ 0.029445933178067207
1690
+ ],
1691
+ [
1692
+ 5,
1693
+ 13850,
1694
+ 0.0286291241645813
1695
+ ],
1696
+ [
1697
+ 6,
1698
+ 16620,
1699
+ 0.028967903926968575
1700
+ ],
1701
+ [
1702
+ 7,
1703
+ 19390,
1704
+ 0.028326265513896942
1705
+ ]
1706
+ ]
1707
+ },
1708
+ "output": {
+ "bleu": [
+ [
+ 1,
+ 2770,
+ 0.294668972492218
+ ],
+ [
+ 2,
+ 5540,
+ 0.30392342805862427
+ ],
+ [
+ 3,
+ 8310,
+ 0.2956589162349701
+ ],
+ [
+ 4,
+ 11080,
+ 0.2938247621059418
+ ],
+ [
+ 5,
+ 13850,
+ 0.2953544855117798
+ ],
+ [
+ 6,
+ 16620,
+ 0.2841779291629791
+ ],
+ [
+ 7,
+ 19390,
+ 0.2575768232345581
+ ]
+ ],
+ "char_error_rate": [
+ [
+ 1,
+ 2770,
+ 3.8609488010406494
+ ],
+ [
+ 2,
+ 5540,
+ 2.8585927486419678
+ ],
+ [
+ 3,
+ 8310,
+ 2.888094663619995
+ ],
+ [
+ 4,
+ 11080,
+ 2.827108383178711
+ ],
+ [
+ 5,
+ 13850,
+ 2.8774003982543945
+ ],
+ [
+ 6,
+ 16620,
+ 2.8187196254730225
+ ],
+ [
+ 7,
+ 19390,
+ 2.8845763206481934
+ ]
+ ],
+ "loss": [
+ [
+ 1,
+ 2770,
+ 0.056642960757017136
+ ],
+ [
+ 2,
+ 5540,
+ 0.03854465112090111
+ ],
+ [
+ 3,
+ 8310,
+ 0.03133770078420639
+ ],
+ [
+ 4,
+ 11080,
+ 0.029445933178067207
+ ],
+ [
+ 5,
+ 13850,
+ 0.0286291241645813
+ ],
+ [
+ 6,
+ 16620,
+ 0.028967903926968575
+ ],
+ [
+ 7,
+ 19390,
+ 0.028326265513896942
+ ]
+ ],
+ "next_token_perplexity": [
+ [
+ 1,
+ 2770,
+ 12037.177734375
+ ],
+ [
+ 2,
+ 5540,
+ 11975.810546875
+ ],
+ [
+ 3,
+ 8310,
+ 11929.7431640625
+ ],
+ [
+ 4,
+ 11080,
+ 11916.7890625
+ ],
+ [
+ 5,
+ 13850,
+ 11905.4638671875
+ ],
+ [
+ 6,
+ 16620,
+ 11907.5517578125
+ ],
+ [
+ 7,
+ 19390,
+ 11894.1640625
+ ]
+ ],
+ "perplexity": [
+ [
+ 1,
+ 2770,
+ 31882.267578125
+ ],
+ [
+ 2,
+ 5540,
+ 31876.583984375
+ ],
+ [
+ 3,
+ 8310,
+ 31884.275390625
+ ],
+ [
+ 4,
+ 11080,
+ 31881.873046875
+ ],
+ [
+ 5,
+ 13850,
+ 31888.412109375
+ ],
+ [
+ 6,
+ 16620,
+ 31880.4453125
+ ],
+ [
+ 7,
+ 19390,
+ 31888.74609375
+ ]
+ ],
+ "rouge1_fmeasure": [
+ [
+ 1,
+ 2770,
+ 0.4522210359573364
+ ],
+ [
+ 2,
+ 5540,
+ 0.4561161994934082
+ ],
+ [
+ 3,
+ 8310,
+ 0.46019619703292847
+ ],
+ [
+ 4,
+ 11080,
+ 0.47886744141578674
+ ],
+ [
+ 5,
+ 13850,
+ 0.47372379899024963
+ ],
+ [
+ 6,
+ 16620,
+ 0.48917827010154724
+ ],
+ [
+ 7,
+ 19390,
+ 0.5003785490989685
+ ]
+ ],
+ "rouge1_precision": [
+ [
+ 1,
+ 2770,
+ 0.30646950006484985
+ ],
+ [
+ 2,
+ 5540,
+ 0.309090793132782
+ ],
+ [
+ 3,
+ 8310,
+ 0.31218209862709045
+ ],
+ [
+ 4,
+ 11080,
+ 0.3286076486110687
+ ],
+ [
+ 5,
+ 13850,
+ 0.32413795590400696
+ ],
+ [
+ 6,
+ 16620,
+ 0.3378482758998871
+ ],
+ [
+ 7,
+ 19390,
+ 0.3479803204536438
+ ]
+ ],
+ "rouge1_recall": [
+ [
+ 1,
+ 2770,
+ 0.947241485118866
+ ],
+ [
+ 2,
+ 5540,
+ 0.9578168988227844
+ ],
+ [
+ 3,
+ 8310,
+ 0.9620707035064697
+ ],
+ [
+ 4,
+ 11080,
+ 0.963258683681488
+ ],
+ [
+ 5,
+ 13850,
+ 0.9619985818862915
+ ],
+ [
+ 6,
+ 16620,
+ 0.9631357192993164
+ ],
+ [
+ 7,
+ 19390,
+ 0.9642844796180725
+ ]
+ ],
+ "rouge2_fmeasure": [
+ [
+ 1,
+ 2770,
+ 0.42693406343460083
+ ],
+ [
+ 2,
+ 5540,
+ 0.4367271065711975
+ ],
+ [
+ 3,
+ 8310,
+ 0.4432367980480194
+ ],
+ [
+ 4,
+ 11080,
+ 0.46300628781318665
+ ],
+ [
+ 5,
+ 13850,
+ 0.45774954557418823
+ ],
+ [
+ 6,
+ 16620,
+ 0.4741453528404236
+ ],
+ [
+ 7,
+ 19390,
+ 0.4844965636730194
+ ]
+ ],
+ "rouge2_precision": [
+ [
+ 1,
+ 2770,
+ 0.28853029012680054
+ ],
+ [
+ 2,
+ 5540,
+ 0.2951086163520813
+ ],
+ [
+ 3,
+ 8310,
+ 0.29979175329208374
+ ],
+ [
+ 4,
+ 11080,
+ 0.3167739510536194
+ ],
+ [
+ 5,
+ 13850,
+ 0.31229183077812195
+ ],
+ [
+ 6,
+ 16620,
+ 0.32645225524902344
+ ],
+ [
+ 7,
+ 19390,
+ 0.3358590602874756
+ ]
+ ],
+ "rouge2_recall": [
+ [
+ 1,
+ 2770,
+ 0.9076516628265381
+ ],
+ [
+ 2,
+ 5540,
+ 0.9309775829315186
+ ],
+ [
+ 3,
+ 8310,
+ 0.9407065510749817
+ ],
+ [
+ 4,
+ 11080,
+ 0.9448210000991821
+ ],
+ [
+ 5,
+ 13850,
+ 0.9431052803993225
+ ],
+ [
+ 6,
+ 16620,
+ 0.9469495415687561
+ ],
+ [
+ 7,
+ 19390,
+ 0.946923017501831
+ ]
+ ],
+ "rougeL_fmeasure": [
+ [
+ 1,
+ 2770,
+ 0.4507341980934143
+ ],
+ [
+ 2,
+ 5540,
+ 0.4551938772201538
+ ],
+ [
+ 3,
+ 8310,
+ 0.4593889117240906
+ ],
+ [
+ 4,
+ 11080,
+ 0.47822490334510803
+ ],
+ [
+ 5,
+ 13850,
+ 0.47310617566108704
+ ],
+ [
+ 6,
+ 16620,
+ 0.4886751174926758
+ ],
+ [
+ 7,
+ 19390,
+ 0.49971580505371094
+ ]
+ ],
+ "rougeL_precision": [
+ [
+ 1,
+ 2770,
+ 0.3054444193840027
+ ],
+ [
+ 2,
+ 5540,
+ 0.30845171213150024
+ ],
+ [
+ 3,
+ 8310,
+ 0.3116239011287689
+ ],
+ [
+ 4,
+ 11080,
+ 0.3281573951244354
+ ],
+ [
+ 5,
+ 13850,
+ 0.32370176911354065
+ ],
+ [
+ 6,
+ 16620,
+ 0.3374961018562317
+ ],
+ [
+ 7,
+ 19390,
+ 0.34750500321388245
+ ]
+ ],
+ "rougeL_recall": [
+ [
+ 1,
+ 2770,
+ 0.9443382620811462
+ ],
+ [
+ 2,
+ 5540,
+ 0.95603346824646
+ ],
+ [
+ 3,
+ 8310,
+ 0.9604740738868713
+ ],
+ [
+ 4,
+ 11080,
+ 0.962070882320404
+ ],
+ [
+ 5,
+ 13850,
+ 0.9608785510063171
+ ],
+ [
+ 6,
+ 16620,
+ 0.9621948599815369
+ ],
+ [
+ 7,
+ 19390,
+ 0.9631230235099792
+ ]
+ ],
+ "rougeLsum_fmeasure": [
+ [
+ 1,
+ 2770,
+ 0.4507194757461548
+ ],
+ [
+ 2,
+ 5540,
+ 0.4550528824329376
+ ],
+ [
+ 3,
+ 8310,
+ 0.45933279395103455
+ ],
+ [
+ 4,
+ 11080,
+ 0.47812482714653015
+ ],
+ [
+ 5,
+ 13850,
+ 0.4730019271373749
+ ],
+ [
+ 6,
+ 16620,
+ 0.48857295513153076
+ ],
+ [
+ 7,
+ 19390,
+ 0.499582976102829
+ ]
+ ],
+ "rougeLsum_precision": [
+ [
+ 1,
+ 2770,
+ 0.30543628334999084
+ ],
+ [
+ 2,
+ 5540,
+ 0.30835309624671936
+ ],
+ [
+ 3,
+ 8310,
+ 0.31157732009887695
+ ],
+ [
+ 4,
+ 11080,
+ 0.32808494567871094
+ ],
+ [
+ 5,
+ 13850,
+ 0.3236253559589386
+ ],
+ [
+ 6,
+ 16620,
+ 0.33741870522499084
+ ],
+ [
+ 7,
+ 19390,
+ 0.34741097688674927
+ ]
+ ],
+ "rougeLsum_recall": [
+ [
+ 1,
+ 2770,
+ 0.9442805647850037
+ ],
+ [
+ 2,
+ 5540,
+ 0.9557678699493408
+ ],
+ [
+ 3,
+ 8310,
+ 0.9604008197784424
+ ],
+ [
+ 4,
+ 11080,
+ 0.9618760943412781
+ ],
+ [
+ 5,
+ 13850,
+ 0.9606844186782837
+ ],
+ [
+ 6,
+ 16620,
+ 0.9620285034179688
+ ],
+ [
+ 7,
+ 19390,
+ 0.9628673195838928
+ ]
+ ],
+ "sequence_accuracy": [
+ [
+ 1,
+ 2770,
+ 0.0
+ ],
+ [
+ 2,
+ 5540,
+ 0.0
+ ],
+ [
+ 3,
+ 8310,
+ 0.0
+ ],
+ [
+ 4,
+ 11080,
+ 0.0
+ ],
+ [
+ 5,
+ 13850,
+ 0.0
+ ],
+ [
+ 6,
+ 16620,
+ 0.0
+ ],
+ [
+ 7,
+ 19390,
+ 0.0
+ ]
+ ],
+ "token_accuracy": [
+ [
+ 1,
+ 2770,
+ 0.00046871474478393793
+ ],
+ [
+ 2,
+ 5540,
+ 0.00047089491272345185
+ ],
+ [
+ 3,
+ 8310,
+ 0.0004673982912208885
+ ],
+ [
+ 4,
+ 11080,
+ 0.0004686332249548286
+ ],
+ [
+ 5,
+ 13850,
+ 0.0004653023788705468
+ ],
+ [
+ 6,
+ 16620,
+ 0.0004659830592572689
+ ],
+ [
+ 7,
+ 19390,
+ 0.00046722154365852475
+ ]
+ ],
+ "word_error_rate": [
+ [
+ 1,
+ 2770,
+ 1.997908115386963
+ ],
+ [
+ 2,
+ 5540,
+ 2.019251585006714
+ ],
+ [
+ 3,
+ 8310,
+ 2.154338836669922
+ ],
+ [
+ 4,
+ 11080,
+ 2.2291877269744873
+ ],
+ [
+ 5,
+ 13850,
+ 2.173459768295288
+ ],
+ [
+ 6,
+ 16620,
+ 2.2866809368133545
+ ],
+ [
+ 7,
+ 19390,
+ 2.64115047454834
+ ]
+ ]
+ }
+ }
+ }
training_set_metadata.json ADDED
The diff for this file is too large to render. See raw diff