Delta-Vector committed
Commit c1262f1 · 1 Parent(s): a00daa6
Update README.md

README.md CHANGED
@@ -20,8 +20,10 @@ tags:
 - chat
 ---
 
-
+![](https://huggingface.co/Delta-Vector/Odin-9B/resolve/main/FinalOdin9B.jpg)
 
+
+An earlier checkpoint of an unreleased (for now) model, using the same configuration as [Tor-8B]() but on Gemma rather than Nemo-8B. A finetune made for creative writing and roleplay tasks, trained on top of the base Gemma2 9B model. I trained the model for 4 epochs, with the 4-epoch checkpoint becoming a future unreleased model and the 2-epoch checkpoint becoming my own personal release. This model aims to have good prose and writing while not being as `Suggestive` as Magnum models usually are, while keeping some of the intelligence that was nice to have with the Gemma2 family.
 
 # Quants
 
@@ -101,7 +103,7 @@ load_in_4bit: false
 strict: false
 
 datasets:
-  - path:
+  - path: [PRIVATE CLAUDE LOG FILTER]
     type: sharegpt
     conversation: chatml
   - path: NewEden/Claude-Instruct-5K
@@ -203,11 +205,11 @@ special_tokens:
 - [Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned](https://huggingface.co/datasets/Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned)
 - [anthracite-org/kalo_opus_misc_240827](https://huggingface.co/datasets/anthracite-org/kalo_opus_misc_240827)
 - [anthracite-org/kalo_misc_part2](https://huggingface.co/datasets/anthracite-org/kalo_misc_part2)
-- [
+- [Private re-Filter of Claude Logs](https://google.com)
 
 
 ## Training
-The training was done for
+The training was done for 4 epochs. We used 8 x [H100](https://www.nvidia.com/en-us/data-center/h100/) GPUs graciously provided by [Lucy Knada](https://huggingface.co/lucyknada) for the full-parameter fine-tuning of the model.
 
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
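For context on the `datasets:` stanza changed above: it is an Axolotl dataset entry. Below is a minimal sketch of how such an entry typically sits in a full Axolotl config; the filename and any keys not visible in the diff are illustrative assumptions, not the exact config used for this model.

```yaml
# odin-9b.yml — illustrative sketch only; everything except the
# datasets stanza visible in the diff above is an assumption.
strict: false

datasets:
  - path: NewEden/Claude-Instruct-5K  # public dataset listed in the diff
    type: sharegpt                    # parse records as ShareGPT-style conversations
    conversation: chatml              # render turns with the ChatML chat template
```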
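The training note above (4 epochs, full-parameter fine-tuning, built with Axolotl) maps onto a handful of config keys. A hedged sketch, assuming standard Axolotl options; these values are inferred from the README prose, not read from the original config:

```yaml
# Assumptions inferred from the README text, not the original config:
num_epochs: 4        # "The training was done for 4 epochs"
load_in_4bit: false  # shown in the diff context; full-parameter, not QLoRA
adapter:             # left unset: full fine-tune rather than a LoRA adapter
```

With Axolotl installed, such a config is typically launched via `accelerate launch -m axolotl.cli.train odin-9b.yml` (filename assumed).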