Collections

Collections including paper arxiv:2407.14057

Inference Optimization

The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines
Paper • 2408.01050 • Published Aug 2 • 8

Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Paper • 2407.18121 • Published Jul 25 • 15

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
Paper • 2407.14057 • Published Jul 19 • 44

Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Paper • 2407.10969 • Published Jul 15 • 20

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
Paper • 2407.14057 • Published Jul 19 • 44

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Paper • 2407.14482 • Published Jul 19 • 25

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?
Paper • 2407.11963 • Published Jul 16 • 43

Human-like Episodic Memory for Infinite Context LLMs
Paper • 2407.09450 • Published Jul 12 • 60

MUSCLE: A Model Update Strategy for Compatible LLM Evolution
Paper • 2407.09435 • Published Jul 12 • 20

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Paper • 2407.09121 • Published Jul 12 • 5

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Paper • 2407.14482 • Published Jul 19 • 25

LLoCO: Learning Long Contexts Offline
Paper • 2404.07979 • Published Apr 11 • 20

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper • 2402.13753 • Published Feb 21 • 111

LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
Paper • 2402.11550 • Published Feb 18 • 15

LongAlign: A Recipe for Long Context Alignment of Large Language Models
Paper • 2401.18058 • Published Jan 31 • 21

Relevant-Papers-Midterm

Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Paper • 2402.14848 • Published Feb 19 • 18

The Prompt Report: A Systematic Survey of Prompting Techniques
Paper • 2406.06608 • Published Jun 6 • 53

CRAG -- Comprehensive RAG Benchmark
Paper • 2406.04744 • Published Jun 7 • 41

Transformers meet Neural Algorithmic Reasoners
Paper • 2406.09308 • Published Jun 13 • 43

Text to image papers

UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Paper • 2311.09257 • Published Nov 14, 2023 • 45

VideoPoet: A Large Language Model for Zero-Shot Video Generation
Paper • 2312.14125 • Published Dec 21, 2023 • 44

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Paper • 2312.16862 • Published Dec 28, 2023 • 30

VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM
Paper • 2401.01256 • Published Jan 2 • 19
