paul-stansifer/llama3-qwantz-coherent
Commit bb3d49f • committed by paul-stansifer
Parent(s): c8f6395
README.md
CHANGED
@@ -9,7 +9,6 @@ metrics:
 model-index:
 - name: llama3-qwantz-coherent
   results: []
-pipeline_tag: text-classification
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -17,10 +16,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 # llama3-qwantz-coherent
 
-This model is a fine-tuned version of [unsloth/llama-3-8b-bnb-4bit](https://huggingface.co/unsloth/llama-3-8b-bnb-4bit) on
+This model is a fine-tuned version of [unsloth/llama-3-8b-bnb-4bit](https://huggingface.co/unsloth/llama-3-8b-bnb-4bit) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Accuracy: 0.
+- Loss: 0.3295
+- Accuracy: 0.8758
 
 ## Model description
 
@@ -51,38 +50,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 0.
-
-
-```
-Can save 90% of coherent strings by discarding 94% of dp strings (cutoff is 75.22732615470886)
-Can save 95% of coherent strings by discarding 91% of dp strings (cutoff is -64.16597366333008)
-Can save 98% of coherent strings by discarding 86% of dp strings (cutoff is -93.8580572605133)
-Can save 99% of coherent strings by discarding 78% of dp strings (cutoff is -99.2882251739502)
-
-I have constructed a rocket-ship for myself ==> coherent: 99.92%
-I have constructed a refund for a ==> dp: 98.67%
-Descartes was a dude who wrote "Cogito ergo sum" which means "I think, therefore I am". PRETTY ==> coherent: 100.00%
-Descartes was a dude who wrote "Cogito ergo sum" which means "I think, therefore finite lifetimes the ==> dp: 99.99%
-That's certainly one way of looking at it, right, Dromiceiomimums? ==> coherent: 100.00%
-That's certainly one way of looking at it, is the ==> dp: 98.15%
-I'm here to pick up my prescription "Happy New year 2004" glasses! They have a plastic "2" on ==> coherent: 99.79%
-I'm here to pick up my prescription "Happy New year 2004" glasses! They have come into cartoon stereotypes ==> dp: 99.99%
-I didn't mean for that to be ==> coherent: 99.13%
-I didn't mean for the police officer ==> dp: 64.15%
-You know what would go down if Nintendo came over? ==> coherent: 100.00%
-You know what would go down if Nintendo i live ==> dp: 100.00%
-"Aw shucks! I guess it IS true that you're never too ==> coherent: 100.00%
-"Aw shucks! I guess it IS true that worse the turmeric ==> dp: 100.00%
-Is it true that the only questions worth asking are those that ==> coherent: 99.98%
-Is it true that the only questions worth preserving if i've been ==> dp: 99.99%
-What? No, he was in pieces. His hand even landed in ==> coherent: 99.44%
-What? No, he was in pieces. His gun that an excellent ==> dp: 99.99%
-Also, many of the signs are really evocative, so they're easy to ==> coherent: 100.00%
-Also, many of the signs are really evocative, so they're approved aaargh ==> dp: 100.00%
-Another beautiful hot day! I look forward to these "dog days" ==> coherent: 99.97%
-Another beautiful hot day! I look forward to return to make ==> dp: 99.81%
-```
+| 0.4482 | 1.0 | 1428 | 0.3295 | 0.8758 |
+
 
 ### Framework versions
 
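The cutoff figures dropped from the old model card ("Can save 90% of coherent strings by discarding 94% of dp strings…") describe a score-threshold tradeoff between keeping coherent strings and discarding "dp" (incoherent) ones. A minimal sketch of how such cutoffs could be computed; the score lists and the helper name `tradeoff_cutoff` are synthetic stand-ins, not the model's real outputs or code:

```python
def tradeoff_cutoff(coherent_scores, dp_scores, keep_fraction):
    """Find the score cutoff that keeps `keep_fraction` of coherent strings,
    and report what fraction of dp (incoherent) strings that cutoff discards."""
    ranked = sorted(coherent_scores, reverse=True)
    # Keep the top keep_fraction of coherent strings; the cutoff is the
    # score of the last string we keep.
    n_keep = max(1, int(len(ranked) * keep_fraction))
    cutoff = ranked[n_keep - 1]
    # Everything scoring below the cutoff is discarded.
    discarded = sum(1 for s in dp_scores if s < cutoff) / len(dp_scores)
    return cutoff, discarded

# Synthetic example: keep 80% of coherent strings.
cutoff, discarded = tradeoff_cutoff([10, 8, 6, 4, 2], [5, 3, 1, -1], 0.8)
```

Sweeping `keep_fraction` over values like 0.90, 0.95, 0.98, 0.99 would produce a table of the same shape as the removed lines.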
adapter_config.json
CHANGED
@@ -20,10 +20,10 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "o_proj",
     "q_proj",
     "v_proj",
-    "k_proj"
+    "k_proj",
+    "o_proj"
   ],
   "task_type": "SEQ_CLS",
   "use_dora": false,
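The adapter_config.json change above only reorders `target_modules`; since PEFT matches target modules by name rather than position, the reorder should be behavior-neutral. A quick check that the old and new lists name the same attention projections:

```python
# Old and new target_modules taken from the diff above.
old_modules = ["o_proj", "q_proj", "v_proj", "k_proj"]
new_modules = ["q_proj", "v_proj", "k_proj", "o_proj"]

# PEFT selects layers by module-name match, so set equality is what matters.
assert set(old_modules) == set(new_modules)
```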
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:aafbb760ed8b904e0cf8d1c3726ac2d22e6f65940f9e2b06b486f66dc54de260
 size 54593240
runs/May03_18-12-38_048cd167e598/events.out.tfevents.1714759959.048cd167e598.211.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:446fb0a0caac3cfa4a41f0b30f0e82fd25cde23bb53e72329bc4884f97d0341c
+size 17949
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:73f9fa2fbada4e354ca6f78290205d9ec5eecfada10232d0a46295520fe058ef
 size 5048
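The adapter_model.safetensors, training_args.bin, and events files above are Git LFS pointer files rather than the binaries themselves: a version line, the blob's sha256 oid, and its size in bytes. A minimal sketch of building such a pointer for a blob, following the spec-v1 format shown in the diffs:

```python
import hashlib

def lfs_pointer(data: bytes) -> str:
    """Build a Git LFS pointer file (spec v1) for a blob."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )

print(lfs_pointer(b"hello"))
```

Git LFS stores only this small pointer in the repository and fetches the actual blob (here, 54593240 bytes for the adapter weights) from LFS storage on checkout.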