- LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
  Paper • 2409.02897 • Published • 44
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12
- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
  Paper • 2405.19327 • Published • 46
Collections including paper arxiv:2405.19327
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 64
- Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
  Paper • 2406.06469 • Published • 23
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
  Paper • 2406.04271 • Published • 27
- Block Transformer: Global-to-Local Language Modeling for Fast Inference
  Paper • 2406.02657 • Published • 36

- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
  Paper • 2405.19327 • Published • 46
- LLM360/K2
  Text Generation • Updated • 615 • 80
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 80
- LLM360: Towards Fully Transparent Open-Source LLMs
  Paper • 2312.06550 • Published • 56

- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 78
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 104
- ReFT: Representation Finetuning for Language Models
  Paper • 2404.03592 • Published • 90
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 60

- MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
  Paper • 2306.00107 • Published • 3
- MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
  Paper • 2309.08730 • Published • 1
- ChatMusician: Understanding and Generating Music Intrinsically with LLM
  Paper • 2402.16153 • Published • 56
- CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark
  Paper • 2401.11944 • Published • 27

- The Generative AI Paradox: "What It Can Create, It May Not Understand"
  Paper • 2311.00059 • Published • 18
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
  Paper • 2403.07816 • Published • 39
- PERL: Parameter Efficient Reinforcement Learning from Human Feedback
  Paper • 2403.10704 • Published • 57

- Dissecting In-Context Learning of Translations in GPTs
  Paper • 2310.15987 • Published • 5
- Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
  Paper • 2309.08958 • Published • 2
- X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
  Paper • 2305.04160 • Published • 2
- Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
  Paper • 2310.08166 • Published • 1