arxiv:2409.05283
Suyash Fulay
sfulay
AI & ML interests
NLP, CSS
Organizations
None yet
Papers
1
Models
66
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-05
sfulay/zephyr-7b-dpo-full-prometheus-reward-scale-1-rpo • 2
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-05 • 1
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-05 • 4
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-1-rpo
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-2 • 4
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo • 3
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-01 • 1
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-1 • 2
sfulay/zephyr-7b-dpo-full-gpt-low-curriculum • 3
Datasets
None public yet