End of training

Files changed (5)

README.md +34 -34
adapter_config.json +1 -2
adapter_model.bin +2 -2
model.safetensors +2 -2
training_args.bin +1 -1

README.md CHANGED

@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0793
 ## Model description
@@ -50,39 +50,39 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 4.0777        | 0.09  | 10   | 0.4600          |
-| 0.2898        | 0.18  | 20   | 0.2397          |
-| 0.2373        | 0.27  | 30   | 0.2282          |
-| 0.2193        | 0.36  | 40   | 0.2107          |
-| 0.2103        | 0.45  | 50   | 0.2865          |
-| 0.1796        | 0.54  | 60   | 0.1327          |
-| 1.8078        | 0.63  | 70   | 1.0824          |
-| 0.2232        | 0.73  | 80   | 0.0985          |
-| 0.1319        | 0.82  | 90   | 0.1969          |
-| 0.1317        | 0.91  | 100  | 0.1082          |
-| 0.0954        | 1.0   | 110  | 0.1045          |
-| 0.0735        | 1.09  | 120  | 0.0701          |
-| 0.07          | 1.18  | 130  | 0.0840          |
-| 0.0742        | 1.27  | 140  | 0.0764          |
-| 0.0699        | 1.36  | 150  | 0.0727          |
-| 0.0727        | 1.45  | 160  | 0.0744          |
-| 0.0642        | 1.54  | 170  | 0.0733          |
-| 0.0702        | 1.63  | 180  | 0.0726          |
-| 0.0657        | 1.72  | 190  | 0.0670          |
-| 0.0605        | 1.81  | 200  | 0.0661          |
-| 0.0625        | 1.9   | 210  | 0.0747          |
-| 0.0603        | 1.99  | 220  | 0.0679          |
-| 0.0318        | 2.08  | 230  | 0.0775          |
-| 0.0275        | 2.18  | 240  | 0.0901          |
-| 0.026         | 2.27  | 250  | 0.0907          |
-| 0.023         | 2.36  | 260  | 0.0831          |
-| 0.0281        | 2.45  | 270  | 0.0792          |
-| 0.0197        | 2.54  | 280  | 0.0813          |
-| 0.0203        | 2.63  | 290  | 0.0850          |
-| 0.0281        | 2.72  | 300  | 0.0830          |
-| 0.0277        | 2.81  | 310  | 0.0809          |
-| 0.0248        | 2.9   | 320  | 0.0798          |
-| 0.0257        | 2.99  | 330  | 0.0793          |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0635
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 4.4865        | 0.09  | 10   | 0.8844          |
+| 0.3885        | 0.18  | 20   | 0.2386          |
+| 0.2667        | 0.27  | 30   | 0.2422          |
+| 0.2338        | 0.36  | 40   | 0.2291          |
+| 0.2321        | 0.45  | 50   | 0.2215          |
+| 0.226         | 0.54  | 60   | 0.2156          |
+| 0.2283        | 0.63  | 70   | 0.2006          |
+| 0.2115        | 0.73  | 80   | 0.2034          |
+| 0.1803        | 0.82  | 90   | 0.1707          |
+| 0.1687        | 0.91  | 100  | 0.2004          |
+| 0.1851        | 1.0   | 110  | 0.1699          |
+| 0.1641        | 1.09  | 120  | 0.1648          |
+| 0.1642        | 1.18  | 130  | 0.1666          |
+| 0.1748        | 1.27  | 140  | 0.1629          |
+| 0.1665        | 1.36  | 150  | 0.1627          |
+| 0.134         | 1.45  | 160  | 0.1023          |
+| 0.1085        | 1.54  | 170  | 0.0887          |
+| 0.0876        | 1.63  | 180  | 0.0784          |
+| 0.0749        | 1.72  | 190  | 0.0729          |
+| 0.0673        | 1.81  | 200  | 0.0721          |
+| 0.0687        | 1.9   | 210  | 0.0764          |
+| 0.0648        | 1.99  | 220  | 0.0698          |
+| 0.0477        | 2.08  | 230  | 0.0734          |
+| 0.0522        | 2.18  | 240  | 0.0681          |
+| 0.0437        | 2.27  | 250  | 0.0679          |
+| 0.0437        | 2.36  | 260  | 0.0661          |
+| 0.0497        | 2.45  | 270  | 0.0651          |
+| 0.0437        | 2.54  | 280  | 0.0651          |
+| 0.0433        | 2.63  | 290  | 0.0651          |
+| 0.0464        | 2.72  | 300  | 0.0649          |
+| 0.0558        | 2.81  | 310  | 0.0640          |
+| 0.0455        | 2.9   | 320  | 0.0635          |
+| 0.0469        | 2.99  | 330  | 0.0635          |
 ### Framework versions
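
The results table in this README can be checked mechanically. The following is a minimal sketch, not part of the commit: the function name `parse_results_table` and the three sample rows (copied from the end of the new table) are illustrative assumptions. It parses the four-column rows and picks the step with the lowest validation loss.

```python
def parse_results_table(markdown: str) -> list[dict]:
    """Parse 'Training Loss | Epoch | Step | Validation Loss' rows.

    Hypothetical helper for illustration; header and separator rows are skipped.
    """
    rows = []
    for line in markdown.splitlines():
        # Drop a leading diff marker ('+' or '-') and the outer pipes.
        cells = [c.strip() for c in line.lstrip("+- ").strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        try:
            row = {
                "train_loss": float(cells[0]),
                "epoch": float(cells[1]),
                "step": int(cells[2]),
                "val_loss": float(cells[3]),
            }
        except ValueError:
            continue  # header or ':---:' separator row
        rows.append(row)
    return rows


# Last three rows of the new table, copied from the diff above.
SAMPLE = """
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
+| 0.0558        | 2.81  | 310  | 0.0640          |
+| 0.0455        | 2.9   | 320  | 0.0635          |
+| 0.0469        | 2.99  | 330  | 0.0635          |
"""

rows = parse_results_table(SAMPLE)
# min() returns the first row that reaches the lowest validation loss.
best = min(rows, key=lambda r: r["val_loss"])
print(best["step"], best["val_loss"])
```

On these sample rows the lowest validation loss matches the `Loss: 0.0635` line reported at the top of the new README.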

adapter_config.json CHANGED

@@ -41,8 +41,7 @@
   "simple_hidden_matching": false,
   "simple_instance_matching": true,
   "target_modules": [
-    "qkv_proj",
-    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "token_dim": 3072

   "simple_hidden_matching": false,
   "simple_instance_matching": true,
   "target_modules": [
+    "qkv_proj"
   ],
   "task_type": "CAUSAL_LM",
   "token_dim": 3072

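The `target_modules` change above can be summarized programmatically. This is a minimal sketch using only the standard library: the function name `target_module_changes` is hypothetical, and the two JSON strings are trimmed to the keys visible in this hunk rather than being the full config files.

```python
import json


def target_module_changes(old_config: str, new_config: str) -> tuple[list[str], list[str]]:
    """Return (removed, added) target modules between two adapter configs.

    Hypothetical helper for illustration; a missing key is treated as an empty list.
    """
    old = set(json.loads(old_config).get("target_modules", []))
    new = set(json.loads(new_config).get("target_modules", []))
    return sorted(old - new), sorted(new - old)


# Trimmed stand-ins for the old and new adapter_config.json (assumption:
# only the keys shown in the diff hunk are reproduced here).
OLD = '{"target_modules": ["qkv_proj", "o_proj"], "task_type": "CAUSAL_LM", "token_dim": 3072}'
NEW = '{"target_modules": ["qkv_proj"], "task_type": "CAUSAL_LM", "token_dim": 3072}'

removed, added = target_module_changes(OLD, NEW)
print("removed:", removed)
print("added:", added)
```

In this commit the only difference is that `o_proj` is no longer targeted; `qkv_proj` is kept.
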
adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5f68193b835a1d5b176c3956ad28fbc4d6e51dceca4366310266bb79369392a6
-size 430750193

 version https://git-lfs.github.com/spec/v1
+oid sha256:3181e3dc62e83e655d1bf550e1aed15fb425eb74fbda084d595f81c533750c21
+size 326557137

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c7d14620493647deec0828464d681f2998a90c48007d220f9351a5c81a2bd0c4
-size 7921324520

 version https://git-lfs.github.com/spec/v1
+oid sha256:b8c97b1070649166c0547cade33800fdd2572cf65871424ea30fd4942e647d94
+size 7867763280

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c6619c2b6033f4d4af46ad40b9303ab2bce7e6049dcea2a5e3d401732a24db1b
 size 5176

 version https://git-lfs.github.com/spec/v1
+oid sha256:6df9ed098d2194169ec196f89c95288beae3d5dfe9868d67bc0caaefb114c25d
 size 5176
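
The three binary files above are stored as Git LFS pointers (`version`, `oid`, `size`). As a minimal sketch, assuming pointer text in exactly that three-line form, the hypothetical helper `parse_lfs_pointer` below reads a pointer and compares the old and new sizes of `adapter_model.bin`, using the values from the diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields.

    Hypothetical helper for illustration; expects 'version', 'oid', and 'size' lines.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size in bytes
    }


# Old and new pointers for adapter_model.bin, copied from the diff above.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:5f68193b835a1d5b176c3956ad28fbc4d6e51dceca4366310266bb79369392a6
size 430750193
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3181e3dc62e83e655d1bf550e1aed15fb425eb74fbda084d595f81c533750c21
size 326557137
"""

old, new = parse_lfs_pointer(OLD_POINTER), parse_lfs_pointer(NEW_POINTER)
shrink = old["size"] - new["size"]
print(f"adapter_model.bin shrank by {shrink} bytes")
```

The smaller adapter file is consistent with the `target_modules` list dropping from two entries to one, although the diff itself does not state the cause.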