ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_helpfulness Viewer • Updated Jun 12 • 60.9k • 45
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_truthfulness Viewer • Updated Jun 12 • 60.9k • 43
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_instruction_following Viewer • Updated Jun 12 • 60.9k • 40 • 3
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_honesty Viewer • Updated Jun 12 • 60.9k • 36
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_pythia6.9b Viewer • Updated Jun 20 • 177k • 58
yaswanthchittepu/ultrafeedback-binarized-standard-margin-data-full Viewer • Updated Jul 7 • 63.7k • 41
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_pythia1b Viewer • Updated May 16 • 177k • 44
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706887192 Viewer • Updated Feb 2 • 405 • 38
argilla/ultrafeedback-multi-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 158k • 193 • 6
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 22 • 568k • 78
ShenaoZ/0.001_3iters_bs128_declr_nodpo_zephyrbeta_userresponse_dataset Viewer • Updated Apr 26 • 67.1k • 35
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1707245027 Viewer • Updated Feb 7 • 1M • 120
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_lora-sft-finetuned-stage4-iter86000 Viewer • Updated May 22 • 20.8k • 34
giux78/50000-60900-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 17 • 10.9k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3 Viewer • Updated Jun 18 • 21.1k • 38
alvarobartt/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 20, 2023 • 155k • 40
quirky-lats-at-mats/NORMAL_BACKDOOR_alpaca_sleeper_agents_toy_safety_NOT_TRUNCATED_v4 Viewer • Updated Mar 11 • 2.83k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 21 • 568k • 185
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_1 Viewer • Updated Mar 21 • 568k • 98
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 22 • 568k • 141
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 23 • 568k • 112
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_1 Viewer • Updated Mar 25 • 189k • 56
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 568k • 120
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 48
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 25 • 189k • 48
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 26 • 94.6k • 43
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.9 Viewer • Updated Mar 26 • 568k • 76
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_0 Viewer • Updated Jun 17 • 5k • 39
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_minpi_part_3 Viewer • Updated Jun 18 • 21.1k • 42
reshinthadith/pairwise-code-review-instruct-critique-revision-python Viewer • Updated Jan 9, 2023 • 5.24k • 148 • 7
NickyNicky/neovalle_H4rmony_dpo_translated_English_to_Spanish Viewer • Updated May 17 • 2.02k • 43 • 4
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330973 Viewer • Updated Feb 7 • 167 • 40
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.2_self_160m Viewer • Updated Mar 14 • 37.9k • 38
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_0.1_self_160m Viewer • Updated Mar 21 • 37.9k • 42
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_1 Viewer • Updated Mar 21 • 568k • 158
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_2 Viewer • Updated Mar 23 • 568k • 103
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_3 Viewer • Updated Mar 21 • 568k • 125
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 21 • 568k • 125
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1 Viewer • Updated Mar 23 • 568k • 203
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 22 • 568k • 130
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 23 • 568k • 146
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3 Viewer • Updated Mar 23 • 568k • 78
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 23 • 568k • 101
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 62
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 24 • 189k • 41
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_2 Viewer • Updated Mar 24 • 189k • 59
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 24 • 189k • 59
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 24 • 568k • 171
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 24 • 568k • 176
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 25 • 189k • 45
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 59
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_3 Viewer • Updated Mar 25 • 189k • 66
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 90
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.25 Viewer • Updated Mar 26 • 568k • 81
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.9 Viewer • Updated Mar 27 • 568k • 221
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.9 Viewer • Updated Mar 27 • 568k • 118
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 78
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_3 Viewer • Updated May 9 • 4.85k • 38
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_1 Viewer • Updated May 20 • 5k • 33
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_2 Viewer • Updated May 20 • 5k • 33
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_0 Viewer • Updated May 20 • 5.28k • 32
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized Viewer • Updated Jun 12 • 60.9k • 40
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_0 Viewer • Updated Jun 17 • 5k • 38
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_3 Viewer • Updated Jun 17 • 5.29k • 34
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_3 Viewer • Updated Jun 18 • 5.29k • 38
y1xing/orpo_llama3_concatenated_data_with_chris_examples_orpo_instruct_dataset Viewer • Updated Jul 6 • 2.64k • 36
argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 11, 2023 • 155k • 43 • 4
NickyNicky/DIBT_prompts_ranked_En_Es_orpo_dpo_chatML_gemma_V3 Viewer • Updated May 14 • 20.4k • 36 • 1
giux78/10000-20000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 16 • 10k • 46
giux78/20000-50000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 17 • 30k • 41
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1706885434 Viewer • Updated Feb 2 • 24 • 42
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706903049 Viewer • Updated Feb 2 • 167 • 40
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707331096 Viewer • Updated Feb 7 • 87 • 54
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707331527 Viewer • Updated Feb 7 • 462 • 47
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.2_self_70m Viewer • Updated Mar 14 • 37.9k • 40
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.5_self_160m Viewer • Updated Mar 14 • 37.9k • 50
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_0.3_self_160m Viewer • Updated Mar 21 • 37.9k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_1.0_self_160m Viewer • Updated Mar 21 • 18.9k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_1 Viewer • Updated Mar 21 • 568k • 104
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_2 Viewer • Updated Mar 22 • 568k • 195
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 21 • 568k • 138
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 23 • 568k • 145
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 21 • 568k • 201
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 21 • 568k • 135
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 21 • 568k • 75
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_2 Viewer • Updated Mar 21 • 568k • 96
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_3 Viewer • Updated Mar 22 • 568k • 177
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 24 • 568k • 160
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1 Viewer • Updated Mar 22 • 568k • 96
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1 Viewer • Updated Mar 22 • 568k • 71
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2 Viewer • Updated Mar 22 • 568k • 118
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2 Viewer • Updated Mar 22 • 568k • 89
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 22 • 568k • 150
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3 Viewer • Updated Mar 23 • 568k • 79
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3 Viewer • Updated Mar 23 • 568k • 124
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3 Viewer • Updated Mar 23 • 568k • 102
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 24 • 511k • 132
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_2 Viewer • Updated Mar 24 • 189k • 47
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_3 Viewer • Updated Mar 24 • 189k • 74
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_3 Viewer • Updated Mar 24 • 189k • 38
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_3 Viewer • Updated Mar 24 • 189k • 51
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 25 • 189k • 49
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 24 • 189k • 55
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_1 Viewer • Updated Mar 25 • 189k • 50
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 189k • 69
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 46
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 189k • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 25 • 189k • 42
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_3 Viewer • Updated Mar 25 • 189k • 38
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_3 Viewer • Updated Mar 25 • 189k • 42
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 82
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 81
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 80
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 94
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 78
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 73
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 26 • 568k • 142
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.25 Viewer • Updated Mar 26 • 568k • 72
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.75 Viewer • Updated Mar 26 • 568k • 125
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.9 Viewer • Updated Mar 26 • 568k • 78
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.7 Viewer • Updated Mar 27 • 568k • 90
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.5 Viewer • Updated Mar 27 • 568k • 106
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0_eval Viewer • Updated Mar 28 • 568k • 176
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1_t_1.0_eval Viewer • Updated Mar 29 • 568k • 76
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2_t_1.0_eval Viewer • Updated Mar 29 • 568k • 205
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 114
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 232
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo_temp0.7 Viewer • Updated Apr 7 • 20k • 39
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo_temp0.7_length128 Viewer • Updated Apr 14 • 20k • 38
mnoukhov/summarize_from_feedback_tldr3_labelled_vllm_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 9.5k • 37
ShenaoZhang/0.0001_3iters_bs256_nodpo_full6w_userresponse_dataset Viewer • Updated Apr 29 • 46.8k • 63
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo_costa_2.8b_bf16.yml_6e799_new Viewer • Updated May 5 • 20k • 36
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_3 Viewer • Updated May 6 • 4.9k • 36
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_2 Viewer • Updated May 6 • 4.9k • 42
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_3 Viewer • Updated May 6 • 5.19k • 36
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_1 Viewer • Updated May 6 • 5.18k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_2_mini_2 Viewer • Updated May 7 • 4.1k • 41
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_0 Viewer • Updated May 7 • 4.78k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_3 Viewer • Updated May 8 • 5.09k • 37
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_2 Viewer • Updated May 8 • 4.4k • 36
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_1 Viewer • Updated May 8 • 5k • 34
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_minpi_part_2 Viewer • Updated May 8 • 19.4k • 36
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_2 Viewer • Updated May 9 • 4.85k • 35
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_1 Viewer • Updated May 9 • 4.85k • 43
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_0 Viewer • Updated May 9 • 5.16k • 33
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_1 Viewer • Updated May 9 • 5.16k • 36
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr Viewer • Updated May 17 • 107k • 43
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr-step873 Viewer • Updated May 12 • 20k • 41
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_3 Viewer • Updated May 20 • 5k • 34
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_2 Viewer • Updated May 20 • 5k • 39
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_1 Viewer • Updated May 20 • 5.28k • 45
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v2_full-sft-finetuned-stage4-iter86000-v2 Viewer • Updated May 23 • 18.8k • 37
BahaaEldin0/openai_summarize_comparisons_dataset_with_prompts_2_percent Viewer • Updated May 30 • 4.69k • 54
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_2 Viewer • Updated Jun 17 • 5k • 34
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_1 Viewer • Updated Jun 17 • 5k • 37
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_3 Viewer • Updated Jun 17 • 5k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_2 Viewer • Updated Jun 17 • 5k • 37
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_0 Viewer • Updated Jun 17 • 5.28k • 34
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_2 Viewer • Updated Jun 17 • 5.28k • 32
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_0 Viewer • Updated Jun 18 • 5.28k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_1 Viewer • Updated Jun 18 • 5.28k • 36
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_2 Viewer • Updated Jun 18 • 5.28k • 39
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel2_llama8b Viewer • Updated Jun 19 • 92.1k • 37
giux78/ultrafeedback-binarized-preferences-cleaned-ita-ready Viewer • Updated Jan 18 • 60.9k • 40 • 2
NickyNicky/Colossal_Translation_Spanish_to_English_AND_English_to_Spanish_ORPO_DPO_Gemma Viewer • Updated May 6 • 3.4M • 114 • 3
arianhosseini/openai_summarize_comparisons_relabel_pythia1b_iter1_temp0.7 Viewer • Updated Dec 22, 2023 • 20k • 39
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1706885528 Viewer • Updated Feb 2 • 24 • 44
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706886961 Viewer • Updated Feb 2 • 24 • 39
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706887930 Viewer • Updated Feb 2 • 30 • 43
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706893611 Viewer • Updated Feb 2 • 84 • 42
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706896441 Viewer • Updated Feb 2 • 5 • 40
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330518 Viewer • Updated Feb 7 • 167 • 48
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330742 Viewer • Updated Feb 7 • 167 • 39
mnoukhov/openai_summarize_comparisons_tldprompt_relabel_pythia410m-dpo1 Viewer • Updated Feb 19 • 92.5k • 36
mnoukhov/openai_summarize_comparisons_tldrprompt_relabel1b_margin Viewer • Updated Feb 22 • 97.5k • 38
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo Viewer • Updated Feb 26 • 20k • 39
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo Viewer • Updated Feb 26 • 20k • 42
mnoukhov/openai_summarize_generated_20k_relabel_1b_predict_410m-dpo1 Viewer • Updated Feb 26 • 20k • 34
davidberenstein1957/ultrafeedback-binarized-cleaned-and-filtered-random-split Viewer • Updated Mar 14 • 6.69k • 69
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 14 • 37.9k • 44
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.1_self_160m Viewer • Updated Mar 14 • 37.9k • 39
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_3 Viewer • Updated Mar 21 • 568k • 164
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1 Viewer • Updated Mar 22 • 568k • 104
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1 Viewer • Updated Mar 22 • 568k • 137
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2 Viewer • Updated Mar 23 • 568k • 86
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3 Viewer • Updated Mar 23 • 568k • 134
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 23 • 568k • 84
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 24 • 568k • 170
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_1 Viewer • Updated Mar 24 • 189k • 45
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 56
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_1 Viewer • Updated Mar 25 • 189k • 38
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_2 Viewer • Updated Mar 25 • 189k • 70
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 91
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0 Viewer • Updated Apr 19 • 568k • 126
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 117
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 78
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 80
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 136
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 110
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 26 • 94.6k • 45
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 26 • 94.6k • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.5 Viewer • Updated Mar 26 • 568k • 63
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.9 Viewer • Updated Mar 26 • 568k • 58
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.5 Viewer • Updated Mar 26 • 568k • 185
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.5 Viewer • Updated Mar 26 • 568k • 180
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 27 • 568k • 49
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 27 • 568k • 71
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 27 • 568k • 171
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.1 Viewer • Updated Mar 27 • 568k • 139
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.3 Viewer • Updated Mar 27 • 568k • 156
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.1 Viewer • Updated Mar 27 • 568k • 134
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.3 Viewer • Updated Mar 27 • 568k • 153
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.7 Viewer • Updated Mar 27 • 568k • 210
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.5 Viewer • Updated Mar 27 • 568k • 72
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.9 Viewer • Updated Mar 27 • 568k • 89
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0_eval Viewer • Updated Mar 28 • 568k • 153
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 126
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 149
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 169
mnoukhov/summarize_from_feedback_tldr3_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b35a8 Viewer • Updated Apr 16 • 20k • 37
mnoukhov/summarize_from_feedback_tldr3_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 18 • 20k • 33
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 107k • 37
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo_costa_2.8b_bf16.yml_6e799 Viewer • Updated Apr 22 • 107k • 33
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_0 Viewer • Updated Apr 24 • 10k • 35
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_1 Viewer • Updated Apr 24 • 10k • 35
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_2 Viewer • Updated Apr 24 • 10k • 34
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_3 Viewer • Updated Apr 24 • 10k • 36
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_4 Viewer • Updated Apr 24 • 10k • 33
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_5 Viewer • Updated Apr 24 • 10k • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1 Viewer • Updated Apr 26 • 303k • 108
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3 Viewer • Updated Apr 26 • 303k • 97
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_4 Viewer • Updated Apr 26 • 303k • 59
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_5 Viewer • Updated Apr 26 • 303k • 93
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_4 Viewer • Updated Apr 26 • 303k • 79
GENIAC-Team-Ozaki/chatbot-arena-ja-calm2-7b-chat-experimental_deduped Viewer • Updated May 2 • 23.3k • 46
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_0 Viewer • Updated May 6 • 4.9k • 37
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_2 Viewer • Updated May 6 • 4.9k • 34
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_1 Viewer • Updated May 6 • 4.9k • 35
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_1 Viewer • Updated May 6 • 4.9k • 34
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_0 Viewer • Updated May 6 • 5.18k • 40
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_2 Viewer • Updated May 6 • 5.18k • 35
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_2 Viewer • Updated May 7 • 4.78k • 32
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_3 Viewer • Updated May 7 • 4.78k • 37
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_1 Viewer • Updated May 7 • 4.78k • 32
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_0 Viewer • Updated May 8 • 5.28k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_1 Viewer • Updated May 8 • 5.28k • 34
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_0 Viewer • Updated May 8 • 5.18k • 39
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_2 Viewer • Updated May 8 • 5.18k • 36
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_3 Viewer • Updated May 8 • 5.19k • 38
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_0 Viewer • Updated May 8 • 5k • 36
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2 Viewer • Updated May 9 • 19.4k • 41
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_2 Viewer • Updated May 9 • 4.98k • 36
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_3 Viewer • Updated May 9 • 5.09k • 35
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_1 Viewer • Updated May 9 • 5.28k • 40
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_0 Viewer • Updated May 9 • 5.28k • 37
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_minpi_part_3 Viewer • Updated May 9 • 20.6k • 41
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_3 Viewer • Updated May 9 • 5.16k • 35
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_2 Viewer • Updated May 9 • 5.16k • 38
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3 Viewer • Updated May 9 • 20.6k • 37
GENIAC-Team-Ozaki/chatbot-arena-ja-calm2-7b-chat-experimental_deduped_add_generated_text Viewer • Updated May 14 • 12k • 97
GENIAC-Team-Ozaki/chatbot-arena-ja-karakuri-lm-8x7b-chat-v0.1-awq Viewer • Updated May 17 • 12.5k • 37
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr_relabel_pythia1b Viewer • Updated May 17 • 107k • 42
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_1 Viewer • Updated May 20 • 5k • 33
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_3 Viewer • Updated May 20 • 5k • 38
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_3 Viewer • Updated May 20 • 5.29k • 36
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_2 Viewer • Updated May 20 • 5.28k • 34
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_0 Viewer • Updated May 20 • 5.28k • 33
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_1 Viewer • Updated May 20 • 5.28k • 34
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_3 Viewer • Updated May 20 • 5.29k • 36
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_2 Viewer • Updated May 20 • 5.28k • 38
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v3_full-sft-finetuned-stage4-iter86000-v3 Viewer • Updated May 24 • 19.3k • 35
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v4_full-sft-finetuned-stage4-iter86000-v4 Viewer • Updated May 25 • 19.5k • 42
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_3 Viewer • Updated Jun 17 • 5k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_1 Viewer • Updated Jun 17 • 5k • 34
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_llama8b Viewer • Updated Jun 19 • 176k • 34
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706888126 Viewer • Updated Feb 2 • 84 • 33
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__temp Viewer • Updated Feb 6 • 600k • 47
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 14 • 37.9k • 37
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 21 • 568k • 98
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 21 • 568k • 136
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 22 • 568k • 93
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2 Viewer • Updated Mar 22 • 568k • 73
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2 Viewer • Updated Mar 22 • 568k • 175
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 24 • 568k • 109
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 63
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 25 • 189k • 41
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0 Viewer • Updated Apr 19 • 568k • 171
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 135
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 81
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 99
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.75 Viewer • Updated Mar 26 • 568k • 91
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.3 Viewer • Updated Mar 27 • 568k • 136
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.5 Viewer • Updated Mar 27 • 568k • 148
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 129
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 92
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 289
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo_temp0.7 Viewer • Updated Apr 8 • 20k • 35
ShenaoZ/0.001_4iters_bs256_nodpo_only2third_userresponse_dataset Viewer • Updated Apr 26 • 12.2k • 37
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_3 Viewer • Updated May 6 • 4.9k • 36
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_0 Viewer • Updated May 20 • 5k • 32
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_1 Viewer • Updated Jun 17 • 5.28k • 36
mnoukhov/openai_summarize_generated_20k_relabel_pythia410m-dpo1_margin Viewer • Updated Feb 22 • 20k • 73
quirky-lats-at-mats/NORMAL_BACKDOOR_alpaca_sleeper_agents_toy_safety_v4 Viewer • Updated Mar 11 • 2.83k • 34
aengusl/noise5_alpaca_sleeper_agents_toy_safety_NOT_TRUNCATED_v4 Viewer • Updated Mar 11 • 2.83k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 86
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.75 Viewer • Updated Mar 26 • 568k • 165
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.25 Viewer • Updated Mar 26 • 568k • 101
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.1 Viewer • Updated Mar 27 • 568k • 95
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.7 Viewer • Updated Mar 27 • 568k • 94
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0_eval Viewer • Updated Mar 28 • 568k • 108
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo_temp0.7_length128 Viewer • Updated Apr 14 • 20k • 39
mnoukhov/summarize_from_feedback_tldr3_labelled_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 9.5k • 35
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo2_costa_1b_fp16.yml_bfcef Viewer • Updated Apr 21 • 107k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2 Viewer • Updated Apr 26 • 303k • 56
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_0 Viewer • Updated May 6 • 4.9k • 38
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_2 Viewer • Updated May 8 • 5.08k • 36
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_1 Viewer • Updated May 8 • 5.18k • 34
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_3 Viewer • Updated May 8 • 5k • 34
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr-step873_relabel_pythia1b Viewer • Updated May 13 • 20k • 40
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_0 Viewer • Updated May 20 • 5k • 34
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_full-sft-finetuned-stage4-iter86000 Viewer • Updated May 22 • 20.3k • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.2_self_70m Viewer • Updated Mar 15 • 37.9k • 246
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 18 • 189k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 18 • 189k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.1_self_160m Updated Mar 21 • 34
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.5_self_160m Updated Mar 18 • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.2_self_160m Viewer • Updated Mar 15 • 37.9k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.0_self_70m Viewer • Updated Mar 18 • 189k • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.0_self_160m Viewer • Updated Mar 18 • 189k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 19 • 189k • 218
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 19 • 189k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.0_self_70m Viewer • Updated Mar 19 • 189k • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.1_self_160m Updated Mar 19 • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.5_self_160m Updated Mar 19 • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.0_self_160m Updated Mar 19 • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.3_self_160m Updated Mar 21 • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_1.0_self_160m Updated Mar 21 • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0 Viewer • Updated Apr 19 • 568k • 34
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3_t_1.0_eval Viewer • Updated Mar 29 • 568k • 32
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_0 Viewer • Updated May 9 • 4.85k • 35
ContextualAI/ultrabin_clean_max_chosen_rand_rejected_rationalized Viewer • Updated Jun 12 • 60.9k • 36