OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs
Abstract
Despite the recent advancements in Large Language Models (LLMs), which have significantly enhanced generative capabilities across various NLP tasks, LLMs still face limitations in directly handling retrieval tasks. However, many practical applications demand the seamless integration of both retrieval and generation. This paper introduces a novel and efficient One-pass Generation and retrieval framework (OneGen), designed to improve LLMs' performance on tasks that require both generation and retrieval. The proposed framework bridges the traditionally separate training approaches for generation and retrieval by incorporating retrieval tokens generated autoregressively. This enables a single LLM to handle both tasks simultaneously in a unified forward pass. We conduct experiments on two distinct types of composite tasks, RAG and Entity Linking, to validate the pluggability, effectiveness, and efficiency of OneGen in training and inference. Furthermore, our results show that integrating generation and retrieval within the same context preserves the generative capabilities of LLMs while improving retrieval performance. To the best of our knowledge, OneGen is the first framework to enable LLMs to conduct vector retrieval during generation.
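The one-pass design described in the abstract can be pictured as follows: the vocabulary is extended with special retrieval tokens, and the hidden state the LLM produces at a retrieval-token position is used directly as a dense query vector, so generation and retrieval share a single forward pass. The sketch below illustrates this idea; it is not the authors' released code, and the backbone name (gpt2), the token name [RQ], and the randomly initialized passage index are placeholders chosen for illustration.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"          # placeholder backbone for illustration only
RETRIEVAL_TOKEN = "[RQ]"     # hypothetical name for the special retrieval token

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

# Extend the vocabulary with the retrieval token and resize the embedding matrix.
tokenizer.add_special_tokens({"additional_special_tokens": [RETRIEVAL_TOKEN]})
model.resize_token_embeddings(len(tokenizer))
rq_id = tokenizer.convert_tokens_to_ids(RETRIEVAL_TOKEN)

# A single forward pass over a sequence containing the retrieval token:
# the same pass yields next-token logits (for generation) and hidden states
# (from which the retrieval embedding is read off).
text = f"Question: Who wrote Hamlet? {RETRIEVAL_TOKEN}"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Use the last-layer hidden state at the retrieval-token position as the query vector.
last_hidden = outputs.hidden_states[-1][0]                      # (seq_len, hidden)
rq_pos = (inputs["input_ids"][0] == rq_id).nonzero()[-1].item()
query_vec = F.normalize(last_hidden[rq_pos], dim=-1)            # (hidden,)

# Score the query against a passage index (random here as a stand-in for a real,
# precomputed corpus of passage embeddings).
passage_index = F.normalize(torch.randn(1000, query_vec.size(-1)), dim=-1)
scores = passage_index @ query_vec                              # (1000,)
print("Top-5 passage ids:", scores.topk(5).indices.tolist())
```

The snippet covers only the inference-time read-out; in OneGen, the retrieval tokens are additionally optimized for retrieval (e.g., with a contrastive objective) alongside the standard language-modeling loss, which is what allows a single model to serve both roles.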
Community
Hi @Ningyu congrats on your work! Opened an issue to improve the discoverability: https://github.com/zjunlp/OneGen/issues/4.
This is an automated message from the Librarian Bot. The following papers, recommended by the Semantic Scholar API, are similar to this paper:
- Efficient and Scalable Estimation of Tool Representations in Vector Space (2024)
- ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning (2024)
- RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation (2024)
- Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval (2024)
- Large Language Models as Foundations for Next-Gen Dense Retrieval: A Comprehensive Empirical Assessment (2024)