HILL: A Hallucination Identifier for Large Language Models

Cited by: 3
Authors
Leiser, Florian [1 ]
Eckhardt, Sven [2 ]
Leuthe, Valentin [1 ]
Knaeble, Merlin [3 ]
Maedche, Alexander [3 ]
Schwabe, Gerhard [2 ]
Sunyaev, Ali [1 ]
Affiliations
[1] Karlsruhe Inst Technol, Inst Appl Informat & Formal Descript Methods, Karlsruhe, Germany
[2] Univ Zurich, Dept Informat, Zurich, Switzerland
[3] Karlsruhe Inst Technol, Human Ctr Syst Lab, Karlsruhe, Germany
Source
PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024) | 2024
Keywords
ChatGPT; Large Language Models; Artificial Hallucinations; Wizard of Oz; Artifact Development; AUTOMATION; WIZARD OF OZ
DOI
10.1145/3613904.3642428
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) are prone to hallucinations, i.e., nonsensical, unfaithful, and undesirable text. Users tend to over-rely on LLMs and the corresponding hallucinations, which can lead to misinterpretations and errors. To tackle this problem of overreliance, we propose HILL, the "Hallucination Identifier for Large Language Models". First, we identified design features for HILL in a Wizard of Oz study with nine participants. We then implemented HILL based on the identified design features and evaluated its interface design by surveying 17 participants. Further, we investigated HILL's ability to identify hallucinations using an existing question-answering dataset and five user interviews. We find that HILL correctly identifies and highlights hallucinations in LLM responses, enabling users to handle LLM responses with more caution. We thereby propose an easy-to-implement adaptation of existing LLMs and demonstrate the relevance of user-centered design of AI artifacts.
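To make the interface concept from the abstract concrete, the following minimal Python sketch highlights suspect sentences in an LLM response. The detector used here (naive_support_score, a content-word overlap check against a reference passage) is a hypothetical placeholder invented for illustration; it is not HILL's actual scoring method, which is described in the paper itself. Only the highlight-and-caution presentation idea is taken from the abstract.

# Illustrative sketch of a HILL-style hallucination highlighter.
# naive_support_score is a hypothetical stand-in detector, NOT the
# method from the paper; it only demonstrates the interface idea of
# flagging low-confidence sentences so users treat them with caution.
import re

def naive_support_score(sentence: str, reference: str) -> float:
    """Fraction of the sentence's content words found in the reference.
    Placeholder detector: real systems would use a trained model."""
    words = {w for w in re.findall(r"[a-z]+", sentence.lower()) if len(w) > 3}
    if not words:
        return 1.0
    ref_words = set(re.findall(r"[a-z]+", reference.lower()))
    return len(words & ref_words) / len(words)

def highlight_response(response: str, reference: str, threshold: float = 0.5) -> str:
    """Mark sentences whose support score falls below the threshold."""
    sentences = re.split(r"(?<=[.!?])\s+", response.strip())
    out = []
    for s in sentences:
        score = naive_support_score(s, reference)
        if score < threshold:
            out.append(f"[POSSIBLE HALLUCINATION, support={score:.2f}] {s}")
        else:
            out.append(s)
    return "\n".join(out)

if __name__ == "__main__":
    reference = "Marie Curie won Nobel Prizes in Physics (1903) and Chemistry (1911)."
    response = ("Marie Curie won Nobel Prizes in Physics and Chemistry. "
                "She also invented the telephone in 1876.")
    print(highlight_response(response, reference))

Running the sketch flags the fabricated second sentence while leaving the supported first sentence unmarked, mirroring the highlight-then-verify workflow the abstract attributes to HILL.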
Pages: 13
Related papers
50 records in total
  • [21] Large Language Models and the Future of Organization Theory
    Cornelissen, Joep
    Hollerer, Markus A.
    Boxenbaum, Eva
    Faraj, Samer
    Gehman, Joel
    [J]. ORGANIZATION THEORY, 2024, 5 (01)
  • [22] Large language models and their big bullshit potential
    Fisher, Sarah A.
    [J]. ETHICS AND INFORMATION TECHNOLOGY, 2024, 26 (04)
  • [23] Using Large Language Models to Improve Sentiment Analysis in Latvian Language
    Purvins, Pauls
    Urtans, Evalds
    Caune, Vairis
    [J]. BALTIC JOURNAL OF MODERN COMPUTING, 2024, 12 (02): 165-175
  • [24] Hallucination Reduction and Optimization for Large Language Model-Based Autonomous Driving
    Wang, Jue
    [J]. SYMMETRY-BASEL, 2024, 16 (09)
  • [25] Generative AI, Large Language Models, and ChatGPT in Construction Education, Training, and Practice
    Jelodar, Mostafa Babaeian
    [J]. BUILDINGS, 2025, 15 (06)
  • [26] Large Language Models in Science
    Karl-Friedrich Kowalewski
    Severin Rodler
    [J]. Die Urologie, 2024, 63 (9): 860-866
  • [27] Large language models, politics, and the functionalization of language
    Olya Kudina
    Bas de Boer
    [J]. AI and Ethics, 2025, 5 (3): 2367-2379
  • [28] On the creativity of large language models
    Franceschelli, Giorgio
    Musolesi, Mirco
    [J]. AI & SOCIETY, 2024, : 3785 - 3795
  • [29] Large language models and psychiatry
    Orru, Graziella
    Melis, Giulia
    Sartori, Giuseppe
    [J]. INTERNATIONAL JOURNAL OF LAW AND PSYCHIATRY, 2025, 101
  • [30] Large Language Models in Cyberattacks
    S. V. Lebed
    D. E. Namiot
    E. V. Zubareva
    P. V. Khenkin
    A. A. Vorobeva
    D. A. Svichkar
    [J]. Doklady Mathematics, 2024, 110 (Suppl 2): S510-S520