Collections

Discover the best community collections!

Collections including paper arxiv:2409.18869

📑Trending Papers - September 9⃣️

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published 19 days ago • 121
Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5 • 86
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4 • 85
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published 20 days ago • 81

Perception and abstraction. Each modality is tokenized and embedded into vectors for the model to comprehend.

about 18 hours ago

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24 • 38
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30 • 118
Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26 • 47
Recursive Introspection: Teaching Language Model Agents How to Self-Improve

Paper • 2407.18219 • Published Jul 25 • 3

This collection is for Transformer Articles

INT-FP-QSim: Mixed Precision and Formats For Large Language Models and Vision Transformers

Paper • 2307.03712 • Published Jul 7, 2023 • 1
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters

Paper • 2408.04093 • Published Aug 7 • 4
Arcee's MergeKit: A Toolkit for Merging Large Language Models

Paper • 2403.13257 • Published Mar 20 • 20
LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Paper • 2408.10188 • Published Aug 19 • 51

Papers I want to read

Papers in my to-read list

about 15 hours ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13 • 67
Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published May 16 • 125
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24 • 53
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 85

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published 10 days ago • 75

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published 11 days ago • 46
Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published 10 days ago • 75

random interest papers

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Paper • 2409.17481 • Published 11 days ago • 44
Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling

Paper • 2409.14683 • Published 14 days ago • 8
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction

Paper • 2409.17422 • Published 11 days ago • 23
Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published 10 days ago • 75

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection

Paper • 2409.08513 • Published 24 days ago • 10
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

Paper • 2409.08264 • Published 25 days ago • 42
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published 19 days ago • 69
LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published 19 days ago • 30
