Model save

Files changed (6)

README.md +79 -0
all_results.json +9 -0
generation_config.json +12 -0
runs/Sep11_13-48-23_jjb_prism_dev2/events.out.tfevents.1726030716.jjb_prism_dev2.69219.0 +2 -2
train_results.json +9 -0
trainer_state.json +0 -0

README.md ADDED

	@@ -0,0 +1,79 @@

+---
+library_name: transformers
+license: llama3.1
+base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
+tags:
+- trl
+- cpo
+- generated_from_trainer
+model-index:
+- name: llama3.1-cpo_j-full-0911
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# llama3.1-cpo_j-full-0911
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.4373
+- Rewards/chosen: -14.1493
+- Rewards/rejected: -15.5710
+- Rewards/accuracies: 0.6543
+- Rewards/margins: 1.4217
+- Logps/rejected: -155.7095
+- Logps/chosen: -141.4926
+- Logits/rejected: -0.1136
+- Logits/chosen: -0.1476
+- Nll Loss: 0.1725
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 1e-06
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 128
+- total_eval_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 5
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss |
+|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|
+| 1.4367        | 0.9986 | 432  | 1.3926          | -17.0679       | -18.0962         | 0.6565             | 1.0283          | -180.9624      | -170.6792    | -0.4080         | -0.4373       | 0.3200   |
+| 0.5472        | 1.9994 | 865  | 1.2973          | -16.2909       | -17.5852         | 0.6587             | 1.2944          | -175.8523      | -162.9086    | -0.5434         | -0.5688       | 0.2148   |
+| 0.2244        | 2.9980 | 1297 | 1.3861          | -15.7105       | -17.2195         | 0.6565             | 1.5089          | -172.1945      | -157.1052    | -0.3428         | -0.3715       | 0.2034   |
+| 0.1472        | 3.9988 | 1730 | 1.4029          | -14.6462       | -16.1385         | 0.6522             | 1.4923          | -161.3849      | -146.4623    | -0.2701         | -0.3029       | 0.1876   |
+| 0.1143        | 4.9928 | 2160 | 1.4373          | -14.1493       | -15.5710         | 0.6543             | 1.4217          | -155.7095      | -141.4926    | -0.1136         | -0.1476       | 0.1725   |
+### Framework versions
+- Transformers 4.44.2
+- Pytorch 2.3.1
+- Datasets 2.21.0
+- Tokenizers 0.19.1
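
The derived values in the model card above can be cross-checked against the raw hyperparameters and metrics it lists. The sketch below is a sanity check and not part of the commit. It assumes the effective train batch size is per-device batch × devices × gradient accumulation steps, and that `Rewards/margins` is `Rewards/chosen` minus `Rewards/rejected`. Variable names are illustrative.

```python
# Values copied from the README.md in this commit.
train_batch_size = 4              # per device
eval_batch_size = 4               # per device
num_devices = 4
gradient_accumulation_steps = 8

# Assumed relation: per-device batch * devices * accumulation steps.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Final-epoch evaluation rewards from the card.
rewards_chosen = -14.1493
rewards_rejected = -15.5710
# Assumed relation: margin = chosen reward - rejected reward.
rewards_margin = round(rewards_chosen - rewards_rejected, 4)

# (epoch, validation loss) pairs from the "Training results" table.
validation_loss_by_epoch = [
    (0.9986, 1.3926),
    (1.9994, 1.2973),
    (2.9980, 1.3861),
    (3.9988, 1.4029),
    (4.9928, 1.4373),
]
best_epoch, best_validation_loss = min(validation_loss_by_epoch, key=lambda row: row[1])

print(total_train_batch_size, total_eval_batch_size, rewards_margin)
print(best_epoch, best_validation_loss)
```

Under these assumptions the reported totals (128 and 16) and the final margin (1.4217) are reproduced. The table also shows validation loss bottoming out near epoch 2 (1.2973) while training loss keeps falling, which may be worth weighing when picking a checkpoint.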

all_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 4.9927766541462,
+    "total_flos": 0.0,
+    "train_loss": 0.5995175864961412,
+    "train_runtime": 46944.6998,
+    "train_samples": 55376,
+    "train_samples_per_second": 5.898,
+    "train_steps_per_second": 0.046
+}
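
The throughput figures in this file can be checked against the step counts in the README's results table. This is a rough sketch, not the Trainer's own code. It assumes throughput is computed as samples × epochs ÷ runtime, uses the effective batch size of 128 and the final step of 2160 from the README, and rounds `num_epochs` to 5.

```python
import json

# Contents of all_results.json from this commit.
all_results = json.loads("""
{
    "epoch": 4.9927766541462,
    "total_flos": 0.0,
    "train_loss": 0.5995175864961412,
    "train_runtime": 46944.6998,
    "train_samples": 55376,
    "train_samples_per_second": 5.898,
    "train_steps_per_second": 0.046
}
""")

effective_batch_size = 128   # total_train_batch_size from the README
num_epochs = 5               # num_epochs from the README
final_step = 2160            # last row of the README results table

# Full optimizer steps per epoch (the partial last batch is dropped here).
steps_per_epoch = all_results["train_samples"] // effective_batch_size

# Assumed formulas for the reported throughput values.
samples_per_second = all_results["train_samples"] * num_epochs / all_results["train_runtime"]
steps_per_second = final_step / all_results["train_runtime"]
runtime_hours = all_results["train_runtime"] / 3600

print(steps_per_epoch, round(samples_per_second, 3), round(steps_per_second, 3))
print(f"{runtime_hours:.2f} h")
```

With these assumptions the numbers line up: 432 steps per epoch matches the first table row, and the recomputed rates round to the reported 5.898 samples/s and 0.046 steps/s over roughly 13 hours of training.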

generation_config.json ADDED

	@@ -0,0 +1,12 @@

+{
+  "bos_token_id": 128000,
+  "do_sample": true,
+  "eos_token_id": [
+    128001,
+    128008,
+    128009
+  ],
+  "temperature": 0.6,
+  "top_p": 0.9,
+  "transformers_version": "4.44.2"
+}
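
Because `eos_token_id` is a list here, generation is expected to stop on any of the three ids, not just one. The sketch below illustrates that behaviour with plain Python. `truncate_at_stop` is a hypothetical helper written for this example, not a `transformers` function, and the sample token ids are made up.

```python
import json

# Contents of generation_config.json from this commit.
generation_config = json.loads("""
{
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": [128001, 128008, 128009],
  "temperature": 0.6,
  "top_p": 0.9,
  "transformers_version": "4.44.2"
}
""")

# eos_token_id may be a single int or a list; normalise to a set.
eos_ids = generation_config["eos_token_id"]
stop_ids = set(eos_ids if isinstance(eos_ids, list) else [eos_ids])


def truncate_at_stop(token_ids, stop_ids):
    """Hypothetical helper: keep tokens up to, but excluding, the first stop id."""
    kept = []
    for token_id in token_ids:
        if token_id in stop_ids:
            break
        kept.append(token_id)
    return kept


# Made-up token ids; 128009 is one of the configured stop ids.
sample_output = [9906, 1917, 128009, 4000]
print(truncate_at_stop(sample_output, stop_ids))
```

The sampling settings (`do_sample: true`, `temperature: 0.6`, `top_p: 0.9`) mean that outputs are not deterministic unless a seed is fixed or these defaults are overridden at call time.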

runs/Sep11_13-48-23_jjb_prism_dev2/events.out.tfevents.1726030716.jjb_prism_dev2.69219.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:44aeca2f186aff12912d018380c1932ec18d93a4e595277370e46835c77623e9
-size 168792
+oid sha256:885690c34eb19be3d757dd39544336fa7df5bd0866ab3993e0d8ede89d540110
+size 169938

train_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 4.9927766541462,
+    "total_flos": 0.0,
+    "train_loss": 0.5995175864961412,
+    "train_runtime": 46944.6998,
+    "train_samples": 55376,
+    "train_samples_per_second": 5.898,
+    "train_steps_per_second": 0.046
+}

trainer_state.json ADDED

The diff for this file is too large to render.