---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
- dpo
base_model: unsloth/zephyr-sft-bnb-4bit
datasets:
- unalignment/toxic-dpo-v0.2
---
# Uploaded model
- **Developed by:** akaistormherald
- **License:** apache-2.0
- **Finetuned from model:** unsloth/zephyr-sft-bnb-4bit

Mistral 7B + SFT + 4-bit DPO training with unalignment/toxic-dpo-v0.2 == ToxicMist? ☣🌫
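
For readers unfamiliar with the DPO step, here is a minimal sketch of the standard per-example DPO objective in plain Python. It is not the training code used for this model: the function name `dpo_loss` and the default `beta=0.1` are assumptions for illustration, and the actual hyperparameters are not stated in this card.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed sequence log-probabilities.

    beta is an assumed value for illustration, not this model's setting.
    """
    # How much more the policy likes each response than the frozen reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == softplus(-x), written in a numerically stable form.
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Policy identical to the reference: the loss is log(2).
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Policy shifted toward the chosen response: the loss drops below log(2).
improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)
```

The loss falls as the policy raises the log-probability of the chosen response relative to the rejected one, measured against the reference model (here, the SFT base).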