-
Qwen2.5-Coder Technical Report
Paper • 2409.12186 • Published • 125 -
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
Paper • 2409.12122 • Published • 1 -
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Paper • 2405.04434 • Published • 13 -
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 69
Collections
Discover the best community collections!
Collections including paper arxiv:2203.02155
-
Training language models to follow instructions with human feedback
Paper • 2203.02155 • Published • 15 -
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • 2305.18290 • Published • 48 -
Statistical Rejection Sampling Improves Preference Optimization
Paper • 2309.06657 • Published • 13 -
SimPO: Simple Preference Optimization with a Reference-Free Reward
Paper • 2405.14734 • Published • 10
-
Attention Is All You Need
Paper • 1706.03762 • Published • 44 -
Language Models are Few-Shot Learners
Paper • 2005.14165 • Published • 11 -
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Paper • 2305.13245 • Published • 5 -
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 242
-
Attention Is All You Need
Paper • 1706.03762 • Published • 44 -
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 14 -
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Paper • 1910.01108 • Published • 14 -
Language Models are Few-Shot Learners
Paper • 2005.14165 • Published • 11
-
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Paper • 1701.06538 • Published • 4 -
Attention Is All You Need
Paper • 1706.03762 • Published • 44 -
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Paper • 2005.11401 • Published • 12 -
Language Model Evaluation Beyond Perplexity
Paper • 2106.00085 • Published
-
Proximal Policy Optimization Algorithms
Paper • 1707.06347 • Published • 3 -
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • 2305.18290 • Published • 48 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 143 -
Training language models to follow instructions with human feedback
Paper • 2203.02155 • Published • 15
-
DocGraphLM: Documental Graph Language Model for Information Extraction
Paper • 2401.02823 • Published • 34 -
Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper • 2401.02038 • Published • 61 -
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper • 2401.00908 • Published • 181 -
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration
Paper • 2309.01131 • Published • 1
-
Attention Is All You Need
Paper • 1706.03762 • Published • 44 -
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 14 -
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Paper • 1907.11692 • Published • 7 -
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Paper • 1910.01108 • Published • 14