Mitigating spatial hallucination in large language models for path planning via prompt engineering

Cited: 0
Authors
Zhang, Hongjie [1 ]
Deng, Hourui [1 ]
Ou, Jie [2 ]
Feng, Chaosheng [1 ]
Affiliations
[1] Sichuan Normal Univ, Coll Comp Sci, Chengdu 610101, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Peoples R China
Source
SCIENTIFIC REPORTS | 2025, Vol. 15, Issue 01
Keywords
DOI
10.1038/s41598-025-93601-5
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07; 0710; 09;
Abstract
Spatial reasoning in Large Language Models (LLMs) serves as a foundation for embodied intelligence. However, even in simple maze environments, LLMs often struggle to plan correct paths due to hallucination issues. To address this, we propose S2ERS, an LLM-based technique that integrates entity and relation extraction with the on-policy reinforcement learning algorithm Sarsa for optimal path planning. We introduce three key improvements: (1) To tackle spatial hallucination, we extract a graph structure of entities and relations from the text-based maze description, helping LLMs accurately comprehend spatial relationships. (2) To prevent LLMs from getting trapped in dead ends due to context-inconsistency hallucination during long-term reasoning, we insert the state-action value function Q into the prompts, guiding the LLM's path planning. (3) To reduce the token consumption of LLMs, we employ multi-step reasoning, dynamically inserting local Q-tables into the prompt so that the LLM can output multiple actions at once. Our comprehensive experimental evaluation, conducted with the closed-source LLMs ChatGPT 3.5 and ERNIE-Bot 4.0 and the open-source LLM ChatGLM-6B, demonstrates that S2ERS significantly mitigates spatial hallucination in LLMs and improves the success rate and optimal rate by approximately 29% and 19%, respectively, compared to state-of-the-art chain-of-thought (CoT) methods.
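To make points (2) and (3) of the abstract concrete, here is a minimal, self-contained sketch of an on-policy Sarsa learner on a toy text-maze grid, plus a helper that formats the current cell's local Q-values into a prompt fragment. The grid, the reward values, and the local_q_prompt helper are illustrative assumptions for this sketch, not the paper's actual S2ERS implementation.

```python
import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# 0 = free cell, 1 = wall. Start at (0, 0); goal at (2, 2). (Toy example.)
GRID = [
    [0, 0, 1],
    [1, 0, 1],
    [1, 0, 0],
]
GOAL = (2, 2)

def step(state, action):
    """Apply one move; bumping into a wall or the border leaves the state unchanged."""
    r, c = state
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] == 0:
        state = (nr, nc)
    # Assumed reward shaping: small step penalty, bonus at the goal.
    reward = 10.0 if state == GOAL else -1.0
    return state, reward, state == GOAL

def epsilon_greedy(Q, state, eps=0.1):
    """Behavior policy: mostly greedy on Q, with occasional exploration."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def train_sarsa(episodes=300, alpha=0.5, gamma=0.9):
    """On-policy Sarsa update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = (0, 0)
        a = epsilon_greedy(Q, s)
        done = False
        while not done:
            s2, r, done = step(s, a)
            a2 = epsilon_greedy(Q, s2)          # next action chosen by the same policy
            target = r if done else r + gamma * Q[(s2, a2)]
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s, a = s2, a2
    return Q

def local_q_prompt(Q, state):
    """Format the current cell's Q-values as a prompt fragment, mirroring the
    idea of dynamically inserting a local Q-table to steer the LLM's plan."""
    lines = [f"Q({state}, {a}) = {Q[(state, a)]:.2f}" for a in ACTIONS]
    return ("You are at cell " + str(state) + ". Local Q-table:\n"
            + "\n".join(lines)
            + "\nPrefer higher-Q actions when proposing the next few moves.")

if __name__ == "__main__":
    random.seed(0)
    Q = train_sarsa()
    print(local_q_prompt(Q, (0, 0)))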
Pages: 13
Related Papers
50 items in total
  • [31] Investigating Hallucination Tendencies of Large Language Models in Japanese and English
    Tsuruta, Hiromi
    Sakaguchi, Rio
    Research Square
  • [32] Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
    Strobelt, H.
    Webson, A.
    Sanh, V.
    Hoover, B.
    Beyer, J.
    Pfister, H.
    Rush, A. M.
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (01) : 1146 - 1156
  • [33] Prompt Engineering or Fine-Tuning? A Case Study on Phishing Detection with Large Language Models
    Trad, Fouad
    Chehab, Ali
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2024, 6 (01): : 367 - 384
  • [34] The influence of prompt engineering on large language models for protein–protein interaction identification in biomedical literature
    Chang, Yung-Chun
    Huang, Ming-Siang
    Huang, Yi-Hsuan
    Lin, Yi-Hsuan
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [35] HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
    Li, Junyi
    Cheng, Xiaoxue
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 6449 - 6464
  • [36] Hallucination Detection for Generative Large Language Models by Bayesian Sequential Estimation
    Wang, Xiaohua
    Yan, Yuliang
    Huang, Longtao
    Zheng, Xiaoqing
    Huang, Xuanjing
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 15361 - 15371
  • [37] Locating and Mitigating Gender Bias in Large Language Models
    Cai, Yuchen
    Cao, Ding
    Guo, Rongxi
    Wen, Yaqin
    Liu, Guiquan
    Chen, Enhong
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT IV, ICIC 2024, 2024, 14878 : 471 - 482
  • [38] Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning
    Ozdayi, Mustafa Safa
    Peris, Charith
    Fitzgerald, Jack
    Dupuy, Christophe
    Majmudar, Jimit
    Khan, Haidar
    Parikh, Rahil
    Gupta, Rahul
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1512 - 1521
  • [39] Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
    Chen, Yuyan
    Fu, Qiang
    Yuan, Yichen
    Wen, Zhihao
    Fan, Ge
    Liu, Dayiheng
    Zhang, Dongmei
    Li, Zhixu
    Xiao, Yanghua
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 245 - 255
  • [40] Hallucination Mitigation for Retrieval-Augmented Large Language Models: A Review
    Zhang, Wan
    Zhang, Jing
    MATHEMATICS, 2025, 13 (05)