---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
- dpo
base_model: unsloth/zephyr-sft-bnb-4bit
datasets:
- unalignment/toxic-dpo-v0.2
---
# Uploaded model

- **Developed by:** akaistormherald
- **License:** apache-2.0
- **Finetuned from model:** unsloth/zephyr-sft-bnb-4bit

Mistral7b + SFT + 4bit DPO training with unalignment/toxic-dpo-v0.2 == ToxicMist? ☣🌫
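
For readers unfamiliar with the DPO step, the per-pair objective can be sketched in plain Python. This is an illustration of the standard DPO loss only, not the training code used for this model; the function name `dpo_loss` and the default `beta=0.1` are assumptions made for the example, and the inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under the frozen reference (SFT) model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    `beta` is a hypothetical default chosen for illustration.
    """
    # How much more strongly the policy prefers the chosen response
    # over the rejected one, relative to the reference model.
    margin = ((policy_chosen_logp - policy_rejected_logp)
              - (ref_chosen_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Policy identical to the reference: margin is 0.
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Policy favors the chosen response more than the reference does: lower loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(baseline, improved)
```

The loss falls as the policy raises the likelihood of the chosen response relative to the rejected one beyond what the reference model already does, which is what pulls the fine-tuned model toward the preferences encoded in the dataset.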