Towards a Unified View of Preference Learning for Large Language Models: A Survey • Paper • 2409.02795 • Published 16 days ago • 70
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks • Paper • 2403.04783 • Published Mar 2 • 2