Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning

Cited by: 0
Authors
Wang, Jinyuan [1 ,3 ]
Li, Junlong [2 ,3 ]
Zhao, Hai [2 ,3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, SJTU Paris Elite Inst Technol, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[3] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interac, Shanghai, Peoples R China
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023
Funding
National Key R&D Program of China
Keywords
QUESTIONS
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
In open-domain question answering (ODQA), most existing questions require single-hop reasoning over commonsense knowledge. To extend this task further, we formally introduce open-domain multi-hop reasoning (ODMR), which requires answering multi-hop questions with explicit reasoning steps in an open-domain setting. Recently, large language models (LLMs) have proven highly effective at ODQA without an external corpus. Furthermore, chain-of-thought (CoT) prompting, whether manual or automated, further boosts the reasoning capability of LLMs. However, existing automated methods lack quality assurance, while manual approaches suffer from limited scalability and poor diversity, hindering the capabilities of LLMs. In this paper, we propose Self-prompted Chain-of-Thought (SP-CoT), an automated framework for mass-producing high-quality CoTs of LLMs, by LLMs and for LLMs. SP-CoT introduces an automated pipeline for generating high-quality ODMR datasets, an adaptive sampler for in-context CoT selection, and self-prompted inference via in-context learning. Extensive experiments on four multi-hop question-answering benchmarks show that SP-CoT not only significantly surpasses previous SOTA methods on large-scale (175B) LLMs, but also nearly doubles the zero-shot performance of small-scale (13B) LLMs. Further analysis reveals the remarkable capability of SP-CoT to elicit direct and concise intermediate reasoning steps, recalling ∼50% of intermediate answers on the MuSiQue-Ans dataset.
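The abstract names three components: automated ODMR dataset generation, an adaptive sampler for in-context CoT selection, and self-prompted inference via in-context learning. As a rough, hypothetical sketch of the second component, the Python snippet below clusters a pool of self-generated (question, CoT, answer) demonstrations and picks, from each cluster, the demonstration most similar to the test question, yielding an in-context prompt that is both relevant and diverse. The TF-IDF encoder, all names, and the diversity heuristic are assumptions for illustration, not the authors' implementation.

    # Hypothetical sketch of adaptive in-context CoT selection (not the paper's code).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans
    from sklearn.metrics.pairwise import cosine_similarity

    def select_demonstrations(question, pool, k=4):
        """pool: list of (question, cot, answer) triples self-generated by the LLM."""
        texts = [q for q, _, _ in pool]
        vec = TfidfVectorizer().fit(texts + [question])   # stand-in for a real encoder
        X = vec.transform(texts)
        q_emb = vec.transform([question])
        # One cluster per demonstration slot keeps the selected CoTs diverse.
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X.toarray())
        sims = cosine_similarity(q_emb, X)[0]
        chosen = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            chosen.append(max(members, key=lambda i: sims[i]))  # nearest in each cluster
        chosen.sort(key=lambda i: sims[i])  # least-to-most similar ordering heuristic
        return [pool[i] for i in chosen]

    def build_prompt(question, demos):
        """Assemble the self-prompted in-context-learning prompt."""
        parts = [f"Question: {q}\nAnswer: {cot} So the answer is {a}."
                 for q, cot, a in demos]
        parts.append(f"Question: {question}\nAnswer:")
        return "\n\n".join(parts)

Under these assumptions, select_demonstrations requires at least k pooled demonstrations, and the resulting prompt elicits step-by-step answers in the same format as the selected CoTs.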
Pages: 2717-2731 (15 pages)
Related Papers (37 total)
  • [1] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    Wei, Jason
    Wang, Xuezhi
    Schuurmans, Dale
    Bosma, Maarten
    Ichter, Brian
    Xia, Fei
    Chi, Ed H.
    Le, Quoc V.
    Zhou, Denny
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [2] Chain-of-Thought Reasoning in Tabular Language Models
    Zheng, Mingyu
    Hao, Yang
    Jiang, Wenbin
    Lin, Zheng
    Lyu, Yajuan
    She, Qiaoqiao
    Wang, Weiping
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 11006 - 11019
  • [3] Multi-Hop Paragraph Retrieval for Open-Domain Question Answering
    Feldman, Yair
    El-Yaniv, Ran
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 2296 - 2309
  • [4] On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
    Nowak, Franz
    Svete, Anej
    Butoi, Alexandra
    Cotterell, Ryan
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 12510 - 12548
  • [5] Active Prompting with Chain-of-Thought for Large Language Models
    Diao, Shizhe
    Wang, Pengcheng
    Lin, Yong
    Pan, Rui
    Liu, Xiang
    Zhang, Tong
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1330 - 1350
  • [6] Moderating New Waves of Online Hate with Chain-of-Thought Reasoning in Large Language Models
    Vishwamitra, Nishant
    Guo, Keyan
    Romit, Farhan Tajwar
    Ondracek, Isabelle
    Cheng, Long
    Zhao, Ziming
    Hu, Hongxin
    45TH IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP 2024, 2024, : 788 - 806
  • [7] Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models
    He, Liqi
    Li, Zuchao
    Cai, Xiantao
    Wang, Ping
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 18180 - 18187
  • [8] Performance evaluation of large language models with chain-of-thought reasoning ability in clinical laboratory case interpretation
    Yang, He S.
    Li, Jieli
    Yi, Xin
    Wang, Fei
CLINICAL CHEMISTRY AND LABORATORY MEDICINE, 2025
  • [9] ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models
    Chen, Zhipeng
    Zhou, Kun
    Zhang, Beichen
    Gong, Zheng
    Zhao, Wayne Xin
    Wen, Ji-Rong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 14777 - 14790
  • [10] Improving intermediate reasoning in zero-shot chain-of-thought for large language models with filter supervisor-self correction
    Sun, Jun
    Pan, Yiteng
    Yan, Xiaohu
    NEUROCOMPUTING, 2025, 620