---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- sft
base_model: meta-llama/Meta-Llama-3-8B
datasets:
- teknium/OpenHermes-2.5
- grimulkan/theory-of-mind
- grimulkan/physical-reasoning
---

# Llama3 8B Wordcel

Wordcel is a Llama3 fine-tune intended to be used as a mid-training checkpoint for more specific RP/storywriting/creative applications.
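As a rough sketch of that intended use, the snippet below continues supervised fine-tuning from this checkpoint with `trl`'s `SFTTrainer`. The repository ID and dataset name are placeholders (not real paths), and the exact `SFTTrainer`/`SFTConfig` options vary between `trl` versions.

```python
# Hypothetical example: continue supervised fine-tuning from this checkpoint.
# The repo and dataset IDs below are placeholders, not real paths.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

checkpoint = "your-namespace/llama3-8b-wordcel"            # placeholder repo ID
dataset = load_dataset("your-namespace/your-rp-dataset",   # placeholder dataset
                       split="train")                      # expects a "text" column

trainer = SFTTrainer(
    model=checkpoint,            # SFTTrainer can load the model from a hub path
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="wordcel-derivative",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        bf16=True,
        # Raise the sequence-length option for your trl version toward 32k
        # as memory allows.
    ),
)
trainer.train()
```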

It has been trained from Llama3 8B Base on a composite dataset of ~100M tokens that highlights reasoning, (uncensored) stories, classic literature, and assorted interpersonal intelligence tasks.

Components of the composite dataset include [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5), and [Grimulkan](https://huggingface.co/grimulkan)'s [Theory of Mind](https://huggingface.co/datasets/grimulkan/theory-of-mind) and [Physical Reasoning](https://huggingface.co/datasets/grimulkan/physical-reasoning) datasets.

It was trained at a context length of 32k tokens, using linear RoPE scaling with a factor of 4.0. Derivative models should therefore also generalize to 32k tokens.
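A minimal loading sketch under those assumptions follows; the repository ID is a placeholder, and the explicit `rope_scaling` override is redundant if the checkpoint's `config.json` already records it.

```python
# Hypothetical example: load the model for long-context generation.
# The repo ID is a placeholder, not a real path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "your-namespace/llama3-8b-wordcel"  # placeholder repo ID

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    # Redundant if already stored in config.json: 8192 * 4.0 = 32k context.
    rope_scaling={"type": "linear", "factor": 4.0},
)

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```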

If you train a model using this checkpoint, please give clear attribution! The Llama 3 base license likely applies.