Mitigating spatial hallucination in large language models for path planning via prompt engineering

Cited: 0
Authors
Zhang, Hongjie [1 ]
Deng, Hourui [1 ]
Ou, Jie [2 ]
Feng, Chaosheng [1 ]
Affiliations
[1] Sichuan Normal Univ, Coll Comp Sci, Chengdu 610101, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Peoples R China
Source
SCIENTIFIC REPORTS | 2025, Vol. 15, Issue 01
Keywords
DOI
10.1038/s41598-025-93601-5
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject Classification Codes
07; 0710; 09;
Abstract
Spatial reasoning in Large Language Models (LLMs) serves as a foundation for embodied intelligence. However, even in simple maze environments, LLMs often struggle to plan correct paths due to hallucination issues. To address this, we propose S2ERS, an LLM-based technique that integrates entity and relation extraction with the on-policy reinforcement learning algorithm Sarsa for optimal path planning. We introduce three key improvements: (1) To tackle spatial hallucination, we extract a graph structure of entities and relations from the text-based maze description, helping LLMs accurately comprehend spatial relationships. (2) To prevent LLMs from getting trapped in dead ends due to context-inconsistency hallucination during long-term reasoning, we insert the state-action value function Q into the prompts, guiding the LLM's path planning. (3) To reduce the token consumption of LLMs, we employ multi-step reasoning, dynamically inserting local Q-tables into the prompt so that the LLM can output multiple actions at once. Our comprehensive experimental evaluation, conducted with the closed-source LLMs ChatGPT 3.5 and ERNIE-Bot 4.0 and the open-source LLM ChatGLM-6B, demonstrates that S2ERS significantly mitigates spatial hallucination in LLMs and improves the success rate and optimal rate by approximately 29% and 19%, respectively, compared to state-of-the-art chain-of-thought (CoT) methods.
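To make points (2) and (3) of the abstract concrete, here is a minimal, self-contained sketch of an on-policy Sarsa learner on a toy text-maze grid, plus a helper that formats the current cell's local Q-values into a prompt fragment. The grid, the reward values, and the local_q_prompt helper are illustrative assumptions for this sketch, not the paper's actual S2ERS implementation.

```python
import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# 0 = free cell, 1 = wall. Start at (0, 0); goal at (2, 2). (Toy example.)
GRID = [
    [0, 0, 1],
    [1, 0, 1],
    [1, 0, 0],
]
GOAL = (2, 2)

def step(state, action):
    """Apply one move; bumping into a wall or the border leaves the state unchanged."""
    r, c = state
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] == 0:
        state = (nr, nc)
    # Assumed reward shaping: small step penalty, bonus at the goal.
    reward = 10.0 if state == GOAL else -1.0
    return state, reward, state == GOAL

def epsilon_greedy(Q, state, eps=0.1):
    """Behavior policy: mostly greedy on Q, with occasional exploration."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def train_sarsa(episodes=300, alpha=0.5, gamma=0.9):
    """On-policy Sarsa update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = (0, 0)
        a = epsilon_greedy(Q, s)
        done = False
        while not done:
            s2, r, done = step(s, a)
            a2 = epsilon_greedy(Q, s2)          # next action chosen by the same policy
            target = r if done else r + gamma * Q[(s2, a2)]
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s, a = s2, a2
    return Q

def local_q_prompt(Q, state):
    """Format the current cell's Q-values as a prompt fragment, mirroring the
    idea of dynamically inserting a local Q-table to steer the LLM's plan."""
    lines = [f"Q({state}, {a}) = {Q[(state, a)]:.2f}" for a in ACTIONS]
    return ("You are at cell " + str(state) + ". Local Q-table:\n"
            + "\n".join(lines)
            + "\nPrefer higher-Q actions when proposing the next few moves.")

if __name__ == "__main__":
    random.seed(0)
    Q = train_sarsa()
    print(local_q_prompt(Q, (0, 0)))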
Pages: 13
Related Papers
50 items in total
  • [31] Investigating Hallucination Tendencies of Large Language Models in Japanese and English
    Tsuruta, Hiromi
    Sakaguchi, Rio
    Research Square
  • [32] Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
    Strobelt, H.
    Webson, A.
    Sanh, V.
    Hoover, B.
    Beyer, J.
    Pfister, H.
    Rush, A. M.
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (01) : 1146 - 1156
  • [33] Prompt Engineering or Fine-Tuning? A Case Study on Phishing Detection with Large Language Models
    Trad, Fouad
    Chehab, Ali
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2024, 6 (01): : 367 - 384
  • [34] The influence of prompt engineering on large language models for protein–protein interaction identification in biomedical literature
    Chang, Yung-Chun
    Huang, Ming-Siang
    Huang, Yi-Hsuan
    Lin, Yi-Hsuan
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [35] HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
    Li, Junyi
    Cheng, Xiaoxue
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 6449 - 6464
  • [36] Hallucination Detection for Generative Large Language Models by Bayesian Sequential Estimation
    Wang, Xiaohua
    Yan, Yuliang
    Huang, Longtao
    Zheng, Xiaoqing
    Huang, Xuanjing
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 15361 - 15371
  • [37] Locating and Mitigating Gender Bias in Large Language Models
    Cai, Yuchen
    Cao, Ding
    Guo, Rongxi
    Wen, Yaqin
    Liu, Guiquan
    Chen, Enhong
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT IV, ICIC 2024, 2024, 14878 : 471 - 482
  • [38] Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning
    Ozdayi, Mustafa Safa
    Peris, Charith
    Fitzgerald, Jack
    Dupuy, Christophe
    Majmudar, Jimit
    Khan, Haidar
    Parikh, Rahil
    Gupta, Rahul
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1512 - 1521
  • [39] Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
    Chen, Yuyan
    Fu, Qiang
    Yuan, Yichen
    Wen, Zhihao
    Fan, Ge
    Liu, Dayiheng
    Zhang, Dongmei
    Li, Zhixu
    Xiao, Yanghua
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 245 - 255
  • [40] Hallucination Mitigation for Retrieval-Augmented Large Language Models: A Review
    Zhang, Wan
    Zhang, Jing
    MATHEMATICS, 2025, 13 (05)