baptistecolle/mistral-voyager-finetune

baptistecolle HF staff committed on Nov 16, 2023

Commit 8de2310 • 1 Parent(s): cbffeef

End of training

Files changed (4)

README.md +6 -7
adapter_config.json +5 -5
adapter_model.safetensors +1 -1
training_args.bin +2 -2

README.md CHANGED

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1075
+- Loss: 0.1123
 ## Model description
@@ -42,17 +42,16 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1
 - training_steps: 500
-- mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.097         | 0.48  | 100  | 0.1216          |
-| 0.0795        | 0.95  | 200  | 0.1155          |
-| 0.0448        | 1.43  | 300  | 0.1142          |
-| 0.0516        | 1.9   | 400  | 0.1057          |
-| 0.0365        | 2.38  | 500  | 0.1075          |
+| 0.2728        | 0.48  | 100  | 0.1881          |
+| 0.1854        | 0.95  | 200  | 0.1352          |
+| 0.09          | 1.43  | 300  | 0.1251          |
+| 0.0902        | 1.9   | 400  | 0.1121          |
+| 0.062         | 2.38  | 500  | 0.1123          |
 ### Framework versions

adapter_config.json CHANGED

@@ -17,13 +17,13 @@
   "revision": null,
   "target_modules": [
     "o_proj",
-    "gate_proj",
-    "v_proj",
     "down_proj",
-    "k_proj",
-    "up_proj",
+    "q_proj",
     "lm_head",
-    "q_proj"
+    "v_proj",
+    "gate_proj",
+    "up_proj",
+    "k_proj"
   ],
   "task_type": "CAUSAL_LM"
 }
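
The `target_modules` change above appears to be a reordering of the same eight names. A minimal Python sketch, with both lists copied from the old and new sides of this diff, checks that nothing was added or dropped. The variable names are illustrative and are not part of the repository.

```python
# Old and new "target_modules" lists, copied from the adapter_config.json diff.
old_modules = ["o_proj", "gate_proj", "v_proj", "down_proj",
               "k_proj", "up_proj", "lm_head", "q_proj"]
new_modules = ["o_proj", "down_proj", "q_proj", "lm_head",
               "v_proj", "gate_proj", "up_proj", "k_proj"]

# Compare as sets: ordering is ignored, membership is not.
same_members = set(old_modules) == set(new_modules)
added = sorted(set(new_modules) - set(old_modules))
removed = sorted(set(old_modules) - set(new_modules))

print(same_members, added, removed)
```

If the sets match, the adapter targets the same layers before and after the commit. That the order differs only because of how the list was serialized is an assumption; the diff itself does not say so.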

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cc598f3e72c17e2104519e2a845ec3f137c235d6842cc279feae394654374724
+oid sha256:bf8d98e8df7f05494677f12cbe67111ef7eb52de0013cbeead0cf52b57f07bfd
 size 340225480

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ddac5123a1f7ef3bb7ded41862e2bd168332ef032599656ec5ad785fcc46f053
-size 4664
+oid sha256:339cb40a33201baae08801e23f366086440bd7c0036cfe4e37d1e9457c001aeb
+size 4600
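
The two binary files in this commit are tracked as Git LFS pointers, so the diff shows only `version`, `oid`, and `size` lines rather than file contents. As a rough sketch, the pointers can be parsed and compared in Python. The helper `parse_lfs_pointer` is hypothetical, written only for this illustration, and the pointer text is copied from the `training_args.bin` diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }

old_ptr = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:ddac5123a1f7ef3bb7ded41862e2bd168332ef032599656ec5ad785fcc46f053
size 4664
""")
new_ptr = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:339cb40a33201baae08801e23f366086440bd7c0036cfe4e37d1e9457c001aeb
size 4600
""")

content_changed = old_ptr["digest"] != new_ptr["digest"]
size_delta = new_ptr["size"] - old_ptr["size"]
print(content_changed, size_delta)
```

For `training_args.bin` both the digest and the size change. For `adapter_model.safetensors` only the digest changes, while the size stays at 340225480 bytes.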