-
Efficient RLHF: Reducing the Memory Usage of PPO
Paper • 2309.00754 • Published • 13 -
Statistical Rejection Sampling Improves Preference Optimization
Paper • 2309.06657 • Published • 13 -
Aligning Large Multimodal Models with Factually Augmented RLHF
Paper • 2309.14525 • Published • 29 -
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Paper • 2309.10202 • Published • 9
Collections
Discover the best community collections!
Collections including paper arxiv:2310.01798
-
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 38 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77 -
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 82 -
Language Modeling Is Compression
Paper • 2309.10668 • Published • 82
-
LMDX: Language Model-based Document Information Extraction and Localization
Paper • 2309.10952 • Published • 65 -
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 87 -
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models
Paper • 2309.11674 • Published • 31 -
Boolformer: Symbolic Regression of Logic Functions with Transformers
Paper • 2309.12207 • Published • 11