Papers
arxiv:2409.18786

A Survey on the Honesty of Large Language Models

Published on Sep 27
· Submitted by Chufan on Sep 30
#3 Paper of the day
Authors:
,
,
,
,
,
,
,
,
,
,
,

Abstract

Honesty is a fundamental principle for aligning large language models (LLMs) with human values, requiring these models to recognize what they know and don't know and be able to faithfully express their knowledge. Despite promising, current LLMs still exhibit significant dishonest behaviors, such as confidently presenting wrong answers or failing to express what they know. In addition, research on the honesty of LLMs also faces challenges, including varying definitions of honesty, difficulties in distinguishing between known and unknown knowledge, and a lack of comprehensive understanding of related research. To address these issues, we provide a survey on the honesty of LLMs, covering its clarification, evaluation approaches, and strategies for improvement. Moreover, we offer insights for future research, aiming to inspire further exploration in this important area.

Community

Paper submitter
This comment has been hidden
Paper author
•
edited 7 days ago

We are excited to share our work with everyone: A Survey on the Honesty of Large Language Models. In this paper, we systematically review the current research works on the honesty of LLMs and offer insights for future research, aiming to contribute to the development of this field.

Paper: https://arxiv.org/pdf/2409.18786
Project Page: https://github.com/SihengLi99/LLM-Honesty-Survey

image.pngFigure 1: An illustration of an honest LLM that demonstrates both self-knowledge and self-expression.

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2409.18786 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2409.18786 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2409.18786 in a Space README.md to link it from this page.

Collections including this paper 3