- Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 38
- Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 82
- Language Modeling Is Compression
Paper • 2309.10668 • Published • 82
Collections including paper arxiv:2310.17157
- Attention Is All You Need
Paper • 1706.03762 • Published • 44
- ImageNet Large Scale Visual Recognition Challenge
Paper • 1409.0575 • Published • 8
- Sequence to Sequence Learning with Neural Networks
Paper • 1409.3215 • Published • 3
- Language Models are Few-Shot Learners
Paper • 2005.14165 • Published • 11

- Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Paper • 2310.17157 • Published • 11
- Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Paper • 2305.15805 • Published • 1
- Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
Paper • 2305.11186 • Published • 1
- Composable Sparse Fine-Tuning for Cross-Lingual Transfer
Paper • 2110.07560 • Published • 1

- In-Context Learning Creates Task Vectors
Paper • 2310.15916 • Published • 41
- When can transformers reason with abstract symbols?
Paper • 2310.09753 • Published • 2
- Improving Length-Generalization in Transformers via Task Hinting
Paper • 2310.00726 • Published • 1
- In-context Autoencoder for Context Compression in a Large Language Model
Paper • 2307.06945 • Published • 27

- Large Language Models for Compiler Optimization
Paper • 2309.07062 • Published • 23
- Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Paper • 2310.17157 • Published • 11
- FP8-LM: Training FP8 Large Language Models
Paper • 2310.18313 • Published • 31
- Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Paper • 2310.19102 • Published • 10

- Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 38
- EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation
Paper • 2310.08185 • Published • 6
- The Consensus Game: Language Model Generation via Equilibrium Search
Paper • 2310.09139 • Published • 12
- In-Context Pretraining: Language Modeling Beyond Document Boundaries
Paper • 2310.10638 • Published • 28

- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22
- Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16
- Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9
- DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 8