|
--- |
|
license: cc-by-nc-4.0 |
|
tags: |
|
- not-for-all-audiences |
|
- nsfw |
|
--- |
|
|
|
## MiquMaid v2 2x70B DPO
|
|
|
Check out our blog post about this model series [Here!](https://ikaridevgit.github.io/index.html?blog=blogid-6&bo=true#Miqu-base) - Join our Discord server [Here!](https://discord.gg/Bb8pRUXy3Z)
|
|
|
<center>[<a href="https://huggingface.co/NeverSleep/MiquMaid-v2-70B">V2-70B</a> - <a href="https://huggingface.co/NeverSleep/MiquMaid-v2-70B-DPO">V2-70B-DPO</a> - <a href="https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B">V2-2x70B</a> - <a href="https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPO">V2-2x70B-DPO</a>] |
|
</br> |
|
<div style="width: 100%;"> |
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/Wbzwoko-IZbOJfvPaImre.png" style="display: block; margin: auto;"> |
|
</div></center> |
|
|
|
This model uses the Alpaca **prompting format** (see the template under "Custom format" below).
|
|
|
We then built a MoE out of MiquMaid-v2-70B-DPO and the Miqu-70B-DPO base, so that every token is processed by both the finetune AND the base model, working together.
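
To illustrate the idea, here is a minimal, hypothetical PyTorch sketch of a two-expert MoE layer (a conceptual toy, not the actual model code): both experts see every token, and a learned gate blends their outputs per token.

```python
import torch
import torch.nn as nn

class TwoExpertMoE(nn.Module):
    """Toy two-expert MoE: both experts process every token, a gate blends them."""

    def __init__(self, hidden_size: int, ffn_size: int):
        super().__init__()
        # Expert 0 stands in for the finetune, expert 1 for the base model.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(2)
        )
        self.gate = nn.Linear(hidden_size, 2)  # per-token mixing weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, hidden)
        weights = torch.softmax(self.gate(x), dim=-1)             # (b, s, 2)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (b, s, h, 2)
        return (outs * weights.unsqueeze(2)).sum(dim=-1)          # (b, s, h)
```

With only two experts and both always active, every token gets a weighted mix of the finetune expert and the base expert, which is the behavior described above.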
|
|
|
Both models were trained with DPO for uncensoring; more info on Miqu-70B-DPO [here](https://huggingface.co/Undi95/Miqu-70B-Alpaca-DPO-GGUF).
|
|
|
We saw a significant improvement, so we decided to share it, even though the model is very big.
|
|
|
## Credits: |
|
- Undi |
|
- IkariDev |
|
|
|
## Description |
|
|
|
This repo contains FP16 files of MiquMaid-v2-2x70B-DPO. |
|
|
|
Switch: [FP16](https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPO) - [GGUF](https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPO-GGUF) |
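
To run the FP16 weights with Hugging Face `transformers`, a minimal loading sketch (assuming you have enough GPU memory for a 2x70B model; `device_map="auto"` spreads the layers across all visible devices) could look like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NeverSleep/MiquMaid-v2-2x70B-DPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # this repo ships FP16 weights
    device_map="auto",          # shard across available GPUs
)
```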
|
|
|
## Training data used: |
|
- [Aesir datasets](https://huggingface.co/MinervaAI) |
|
- [NoRobots](https://huggingface.co/datasets/Doctor-Shotgun/no-robots-sharegpt) |
|
- [limarp](https://huggingface.co/datasets/lemonilia/LimaRP) |
|
- [toxic-dpo-v0.1-sharegpt](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-sharegpt) |
|
- [ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal) |
|
|
|
## DPO training data used: |
|
- [ToxicDPOqa](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicDPOqa) |
|
- [toxic-dpo-v0.1-NoWarning](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-NoWarning) |
|
|
|
### Custom format: |
|
```
### Instruction:
{system prompt}

### Input:
{input}

### Response:
{reply}
```
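
As a quick usage sketch (reusing the `model` and `tokenizer` from the loading example above; the system prompt and input strings are placeholders), fill the template and leave the `### Response:` section empty for the model to complete:

```python
prompt = (
    "### Instruction:\n"
    "{system_prompt}\n\n"
    "### Input:\n"
    "{user_input}\n\n"
    "### Response:\n"
).format(
    system_prompt="You are Maid, a roleplay character. Stay in character.",
    user_input="Introduce yourself in one sentence.",
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.8)

# Print only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```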
|
|
|
## Others |
|
|
|
Undi: If you want to support us, you can do so [here](https://ko-fi.com/undiai).
|
|
|
IkariDev: Visit my [retro/neocities style website](https://ikaridevgit.github.io/) please kek |