DUAL-GPO/zephyr-7b-dpo-0k-15k-i1-merged
DUAL Group
PEFT
Safetensors
HuggingFaceH4/ultrafeedback_binarized
mistral
alignment-handbook
Generated from Trainer
trl
dpo
License: apache-2.0
main
Commit History
Update README.md (1106fd7, verified): lole25 committed on Sep 20
Update README.md (d1bd63c, verified): lole25 committed on Sep 20
Create README.md (953c86e, verified): lole25 committed on Sep 20
Upload tokenizer (c07d473, verified): BraylonDash committed on Sep 19
Upload MistralForCausalLM (634773f, verified): BraylonDash committed on Sep 19
initial commit (ade2316, verified): BraylonDash committed on Sep 19