---
base_model:
- inflatebot/helide-beta-r3
- inflatebot/helide-beta-r1
- inflatebot/helide-beta-r4
- inflatebot/helide-beta-r0
library_name: transformers
tags:
- mergekit
- merge
---
# L3-Helium3-8B
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
There was a problem with the Helide beta. Three models resulted, each with different strengths. But they all came about as a result of balancing just two models.
That math wasn't quite mathing. There wasn't going to be a way to get the best of all three worlds just by tweaking a SLERP ratio.
But there were three of them.
The name was serendipity.
The layup was obscene.
But I *live* for the bit.
Helium-3 is an RP and storywriting hybrid, ultimately based on Sao10K's [Stheno](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2) and Fizzarolli's [Rosier](https://huggingface.co/Fizzarolli/L3-8b-Rosier-v1), and the culmination of the Helide project.
Combining Rosier's prose and knowledge of niche fetishes with Stheno's steerability and crackling personality, Helium-3 brings the advancements of modern AI models to the Freaks™.
They'll chew you up and spit you out just as readily as they'll shower you with affection.
I'm genuinely proud of this one. This is the model I wish existed.
Thank you to [Fizzarolli](https://huggingface.co/Fizzarolli) for consulting and providing technical assistance, which cut the second leg of this project from several weeks down to a single night, and for making the Rosier model that made this possible. On several levels, H3 wouldn't have been possible without her.
### Merge Method
This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with [inflatebot/helide-beta-r1](https://huggingface.co/inflatebot/helide-beta-r1) as the base.
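
For intuition, here is a minimal sketch of the interpolation rule described in the Model Stock paper, applied to a single flattened weight vector. It is an illustration under stated assumptions, not mergekit's actual implementation: the function name `model_stock_merge` and the toy vectors are hypothetical, real merges work per layer on tensors, and the degenerate case where the ratio's denominator is zero is not handled.

```python
import math
from itertools import combinations


def model_stock_merge(base, finetuned):
    """Merge one flattened weight vector, Model Stock style (illustrative only).

    base:      list of floats, weights of the base model
    finetuned: list of N >= 2 lists of floats, weights of the other models
    """
    n = len(finetuned)
    if n < 2:
        raise ValueError("need at least two fine-tuned models")

    # Offsets of each model from the base
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm > 0 else 0.0

    # Average pairwise cosine between offsets
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(u, v) for u, v in pairs) / len(pairs)

    # Interpolation ratio: t = N*cos / (1 + (N-1)*cos)
    t = n * cos_theta / (1 + (n - 1) * cos_theta)

    # Blend the plain average of the models with the base
    average = [sum(column) / n for column in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(average, base)]


# Toy usage: three stand-ins for the merged models, one for the base.
merged = model_stock_merge([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The more the models agree on which direction to move away from the base (cosine near 1), the closer the result sits to their plain average. When their offsets are unrelated (cosine near 0), the result falls back toward the base.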
### Models Merged
The following models were included in the merge:
* [inflatebot/helide-beta-r3](https://huggingface.co/inflatebot/helide-beta-r3)
* [inflatebot/helide-beta-r4](https://huggingface.co/inflatebot/helide-beta-r4)
* [inflatebot/helide-beta-r0](https://huggingface.co/inflatebot/helide-beta-r0)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
- model: inflatebot/helide-beta-r4
- model: inflatebot/helide-beta-r0
- model: inflatebot/helide-beta-r3
merge_method: model_stock
base_model: inflatebot/helide-beta-r1
dtype: bfloat16
```