Yurun Yuan
RyanYr
AI & ML interests
None yet
Organizations
None yet
Collections
2
models
28
RyanYr/gemma-2-2b-it_CoT-it_SFT • Text Generation • 8
RyanYr/reward-judge_iter-sft-genRM_pilot-exp_iter3 • Text Generation • 8
RyanYr/reward-judge_iter-sft-genRM_pilot-exp_iter2 • Text Generation • 46
RyanYr/reward-judge_iter-sft-genRM_pilot-exp_iter1 • Text Generation • 18
RyanYr/reward-judge_iter-dpo-genRM_pilot-exp_iter3 • 22
RyanYr/reward-judge_iter-dpo-genRM_pilot-exp_iter2 • 29
RyanYr/reward-judge_iter-dpo-genRM_pilot-exp_iter1 • 28
RyanYr/reward-judge_SFT-genRM_pilot-exp • Text Generation • 238
RyanYr/reward-judge_pilot-exp • Text Classification • 187
RyanYr/last-letter-cat_genRM_iter1_pilot_experiment • 36
datasets
40
RyanYr/PRM800k_trajectory_pair • Viewer • 87
RyanYr/PRM800k_completion-wise_labels • Viewer • 15k • 21
RyanYr/CoT-it_SFT-train • Viewer • 59.9k
RyanYr/SkunkworksAI_reasoning-0.01_self-critic-style • Viewer • 29.9k • 2
RyanYr/PRM800k_completion-wise_critiques • Viewer • 15k
RyanYr/PRM800k_completion-wise • Viewer • 1.02M
RyanYr/PRM800k_chosen • Viewer • 88.8k
RyanYr/SkunkworksAI_reasoning-0.01 • Viewer • 29.9k
RyanYr/reward-judge_sft_t0.5-genRM_inf-scaling-law_pilot-exp_rewards • Viewer • 456 • 1
RyanYr/reward-judge_iterBoN2-genRM_inf-scaling-law_pilot-exp_rewards • Viewer • 456 • 2