noeloco committed
Commit 7822d67
1 Parent(s): 1c0a243

End of training

Files changed (2)
  1. README.md +17 -25
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -33,16 +33,15 @@ datasets:
   type: alpaca
   ds_type: json
 
-dataset_prepared_path: noeloco/cameltest
+hf_use_auth_token: true
+push_dataset_to_hub: noeloco
 val_set_size: 0.05
 output_dir: ./lora-out
-
+chat_template: chatml
 hub_model_id: noeloco/modeltest1
-hf_use_auth_token: true
 
 sequence_len: 2048
-sample_packing: true
-eval_sample_packing: False
+sample_packing: false
 pad_to_sequence_len: true
 
 adapter: lora
@@ -61,7 +60,7 @@ wandb_log_model:
 
 gradient_accumulation_steps: 4
 micro_batch_size: 2
-num_epochs: 1
+num_epochs: 2
 optimizer: adamw_bnb_8bit
 lr_scheduler: cosine
 learning_rate: 0.0002
@@ -78,7 +77,7 @@ resume_from_checkpoint:
 local_rank:
 logging_steps: 1
 xformers_attention:
-flash_attention: true
+flash_attention: false
 
 warmup_steps: 10
 evals_per_epoch: 4
@@ -101,7 +100,7 @@ special_tokens:
 
 This model is a fine-tuned version of [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.7345
+- Loss: 0.1202
 
 ## Model description
 
@@ -117,19 +116,6 @@ More information needed
 
 ## Training procedure
 
-
-The following `bitsandbytes` quantization config was used during training:
-- quant_method: bitsandbytes
-- load_in_8bit: True
-- load_in_4bit: False
-- llm_int8_threshold: 6.0
-- llm_int8_skip_modules: None
-- llm_int8_enable_fp32_cpu_offload: False
-- llm_int8_has_fp16_weight: False
-- bnb_4bit_quant_type: fp4
-- bnb_4bit_use_double_quant: False
-- bnb_4bit_compute_dtype: float32
-
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -142,19 +128,25 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- num_epochs: 1
+- num_epochs: 2
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.7285 | 1.0 | 1 | 2.7345 |
+| 1.5644 | 0.06 | 1 | 2.7399 |
+| 1.575 | 0.29 | 5 | 2.6344 |
+| 1.1169 | 0.57 | 10 | 1.2350 |
+| 0.6719 | 0.86 | 15 | 0.5019 |
+| 0.3372 | 1.14 | 20 | 0.2525 |
+| 0.3403 | 1.43 | 25 | 0.1470 |
+| 0.1656 | 1.71 | 30 | 0.1202 |
 
 
 ### Framework versions
 
-- PEFT 0.7.0
-- Transformers 4.37.0.dev0
+- PEFT 0.7.2.dev0
+- Transformers 4.37.0
 - Pytorch 2.0.1+cu118
 - Datasets 2.16.1
 - Tokenizers 0.15.0
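The updated model card describes a LoRA adapter (`noeloco/modeltest1`) trained on top of `codellama/CodeLlama-7b-hf` with PEFT and Transformers. A minimal sketch of loading the adapter for inference, assuming the repo is accessible and the listed libraries are installed; the Alpaca-style prompt is only an assumption based on the config's `type: alpaca` dataset setting:

```python
# Sketch: load the CodeLlama base model and apply the LoRA adapter from this repo.
# Repo ids are taken from the README; dtype/device settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "codellama/CodeLlama-7b-hf"
adapter_id = "noeloco/modeltest1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires accelerate
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Assumed Alpaca-style prompt, per the dataset type in the training config.
prompt = "### Instruction:\nWrite a Python function that reverses a string.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```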
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e21d98e65be8ce1f9ac99ceb19b3bd3aa73dd696cc7f7c22965f55c745f9b567
+oid sha256:54e31da3d13f2c089248c59cf1951dfe80272e8d85a2e8ce4cbefb2aaa759ee1
 size 80114765
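Since adapter_model.bin is tracked with Git LFS, only the pointer's sha256 and size appear in the diff. A small sketch, assuming a locally downloaded copy of the weights (the local path is hypothetical), for checking that the file matches the new pointer:

```python
# Sketch: verify a downloaded adapter_model.bin against the LFS pointer's sha256 and size.
import hashlib
import os

path = "adapter_model.bin"  # hypothetical local path; adjust to where the file was downloaded
expected_sha = "54e31da3d13f2c089248c59cf1951dfe80272e8d85a2e8ce4cbefb2aaa759ee1"
expected_size = 80114765

h = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        h.update(chunk)

print("size ok:", os.path.getsize(path) == expected_size)
print("sha256 ok:", h.hexdigest() == expected_sha)
```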