RyanYr (Yurun Yuan)

Collections 2

models 28

datasets 40

RyanYr/PRM800k_trajectory_pair

Viewer • Updated 20 minutes ago • 87

RyanYr/PRM800k_completion-wise_labels

Viewer • Updated about 17 hours ago • 15k • 21

RyanYr/CoT-it_SFT-train

Viewer • Updated 1 day ago • 59.9k

RyanYr/SkunkworksAI_reasoning-0.01_self-critic-style

Viewer • Updated 1 day ago • 29.9k • 2

RyanYr/PRM800k_completion-wise_critiques

Viewer • Updated 3 days ago • 15k

RyanYr/PRM800k_completion-wise

Viewer • Updated 4 days ago • 1.02M

RyanYr/PRM800k_chosen

Viewer • Updated 4 days ago • 88.8k

RyanYr/SkunkworksAI_reasoning-0.01

Viewer • Updated 6 days ago • 29.9k

RyanYr/reward-judge_sft_t0.5-genRM_inf-scaling-law_pilot-exp_rewards

Viewer • Updated 6 days ago • 456 • 1

RyanYr/reward-judge_iterBoN2-genRM_inf-scaling-law_pilot-exp_rewards

Viewer • Updated 7 days ago • 456 • 2

Collections 2

peiyi9979/Math-Shepherd

Idavidrein/gpqa

models 28

RyanYr/gemma-2-2b-it_CoT-it_SFT

RyanYr/reward-judge_iter-sft-genRM_pilot-exp_iter3

RyanYr/reward-judge_iter-sft-genRM_pilot-exp_iter2

RyanYr/reward-judge_iter-sft-genRM_pilot-exp_iter1

RyanYr/reward-judge_iter-dpo-genRM_pilot-exp_iter3

RyanYr/reward-judge_iter-dpo-genRM_pilot-exp_iter2

RyanYr/reward-judge_iter-dpo-genRM_pilot-exp_iter1

RyanYr/reward-judge_SFT-genRM_pilot-exp

RyanYr/reward-judge_pilot-exp

RyanYr/last-letter-cat_genRM_iter1_pilot_experiment
