Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2310.17157

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 82

Transformer Arch

Checkout: https://bbycroft.net/llm and http://nlp.seas.harvard.edu/2018/04/03/attention.html

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 44
ImageNet Large Scale Visual Recognition Challenge

Paper • 1409.0575 • Published Sep 1, 2014 • 8
Sequence to Sequence Learning with Neural Networks

Paper • 1409.3215 • Published Sep 10, 2014 • 3
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Paper • 2310.17157 • Published Oct 26, 2023 • 11
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers

Paper • 2305.15805 • Published May 25, 2023 • 1
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt

Paper • 2305.11186 • Published May 17, 2023 • 1
Composable Sparse Fine-Tuning for Cross-Lingual Transfer

Paper • 2110.07560 • Published Oct 14, 2021 • 1

Context compression

In-Context Learning Creates Task Vectors

Paper • 2310.15916 • Published Oct 24, 2023 • 41
When can transformers reason with abstract symbols?

Paper • 2310.09753 • Published Oct 15, 2023 • 2
Improving Length-Generalization in Transformers via Task Hinting

Paper • 2310.00726 • Published Oct 1, 2023 • 1
In-context Autoencoder for Context Compression in a Large Language Model

Paper • 2307.06945 • Published Jul 13, 2023 • 27

Large Language Models for Compiler Optimization

Paper • 2309.07062 • Published Sep 11, 2023 • 23
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Paper • 2310.17157 • Published Oct 26, 2023 • 11
FP8-LM: Training FP8 Large Language Models

Paper • 2310.18313 • Published Oct 27, 2023 • 31
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Paper • 2310.19102 • Published Oct 29, 2023 • 10

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation

Paper • 2310.08185 • Published Oct 12, 2023 • 6
The Consensus Game: Language Model Generation via Equilibrium Search

Paper • 2310.09139 • Published Oct 13, 2023 • 12
In-Context Pretraining: Language Modeling Beyond Document Boundaries

Paper • 2310.10638 • Published Oct 16, 2023 • 28

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 22
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 9
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 8

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs