|
--- |
|
license: cc-by-nc-4.0 |
|
tags: |
|
- not-for-all-audiences |
|
- nsfw |
|
--- |
|
|
|
## MiquMaid v2 2x70B DPO
|
|
|
Check out our blog post about this model series [Here!](https://ikaridevgit.github.io/index.html?blog=blogid-6&bo=true#Miqu-base) - Join our Discord server [Here!](https://discord.gg/Bb8pRUXy3Z)
|
|
|
<center>[<a href="https://huggingface.co/NeverSleep/MiquMaid-v2-70B">V2-70B</a> - <a href="https://huggingface.co/NeverSleep/MiquMaid-v2-70B-DPO">V2-70B-DPO</a> - <a href="https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B">V2-2x70B</a> - <a href="https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPO">V2-2x70B-DPO</a>] |
|
</br> |
|
<div style="width: 100%;"> |
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/Wbzwoko-IZbOJfvPaImre.png" style="display: block; margin: auto;"> |
|
</div></center> |
|
|
|
This model uses the Alpaca **prompting format** (see the template under "Custom format" below).
|
|
|
We then built a MoE out of MiquMaid-v2-70B-DPO and the Miqu-70B-DPO base, so that every token is processed by both the finetune AND the base model, working together.
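
To illustrate the idea, here is a minimal, hypothetical PyTorch sketch of a two-expert MoE layer (a conceptual toy, not the actual model code): both experts see every token, and a learned gate blends their outputs per token.

```python
import torch
import torch.nn as nn

class TwoExpertMoE(nn.Module):
    """Toy two-expert MoE: both experts process every token, a gate blends them."""

    def __init__(self, hidden_size: int, ffn_size: int):
        super().__init__()
        # Expert 0 stands in for the finetune, expert 1 for the base model.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(2)
        )
        self.gate = nn.Linear(hidden_size, 2)  # per-token mixing weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, hidden)
        weights = torch.softmax(self.gate(x), dim=-1)             # (b, s, 2)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (b, s, h, 2)
        return (outs * weights.unsqueeze(2)).sum(dim=-1)          # (b, s, h)
```

With only two experts and both always active, every token gets a weighted mix of the finetune expert and the base expert, which is the behavior described above.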
|
|
|
Both models were trained with DPO for uncensoring; more info on Miqu-70B-DPO [here](https://huggingface.co/Undi95/Miqu-70B-Alpaca-DPO-GGUF).
|
|
|
We saw a significant improvement, so we decided to share it, even though the model is very big.
|
|
|
## Credits: |
|
- Undi |
|
- IkariDev |
|
|
|
## Description |
|
|
|
This repo contains FP16 files of MiquMaid-v2-2x70B-DPO. |
|
|
|
Switch: [FP16](https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPO) - [GGUF](https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPO-GGUF) |
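
To run the FP16 weights with Hugging Face `transformers`, a minimal loading sketch (assuming you have enough GPU memory for a 2x70B model; `device_map="auto"` spreads the layers across all visible devices) could look like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NeverSleep/MiquMaid-v2-2x70B-DPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # this repo ships FP16 weights
    device_map="auto",          # shard across available GPUs
)
```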
|
|
|
## Training data used: |
|
- [Aesir datasets](https://huggingface.co/MinervaAI) |
|
- [NoRobots](https://huggingface.co/datasets/Doctor-Shotgun/no-robots-sharegpt) |
|
- [limarp](https://huggingface.co/datasets/lemonilia/LimaRP) |
|
- [toxic-dpo-v0.1-sharegpt](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-sharegpt) |
|
- [ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal) |
|
|
|
## DPO training data used: |
|
- [ToxicDPOqa](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicDPOqa) |
|
- [toxic-dpo-v0.1-NoWarning](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-NoWarning) |
|
|
|
### Custom format: |
|
```
### Instruction:
{system prompt}

### Input:
{input}

### Response:
{reply}
```
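
As a quick usage sketch (reusing the `model` and `tokenizer` from the loading example above; the system prompt and input strings are placeholders), fill the template and leave the `### Response:` section empty for the model to complete:

```python
prompt = (
    "### Instruction:\n"
    "{system_prompt}\n\n"
    "### Input:\n"
    "{user_input}\n\n"
    "### Response:\n"
).format(
    system_prompt="You are Maid, a roleplay character. Stay in character.",
    user_input="Introduce yourself in one sentence.",
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.8)

# Print only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```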
|
|
|
## Others |
|
|
|
Undi: If you want to support us, you can do so [here](https://ko-fi.com/undiai).
|
|
|
IkariDev: Visit my [retro/neocities style website](https://ikaridevgit.github.io/) please kek |