Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning

Cited by: 0
Authors
Wang, Jinyuan [1 ,3 ]
Li, Junlong [2 ,3 ]
Zhao, Hai [2 ,3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, SJTU Paris Elite Inst Technol, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[3] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interac, Shanghai, Peoples R China
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023
Funding
National Key R&D Program of China
Keywords
QUESTIONS
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
In open-domain question answering (ODQA), most existing questions require single-hop reasoning over commonsense knowledge. To extend this task further, we formally introduce open-domain multi-hop reasoning (ODMR), which requires answering multi-hop questions with explicit reasoning steps in an open-domain setting. Recently, large language models (LLMs) have proven highly effective at ODQA without an external corpus. Furthermore, chain-of-thought (CoT) prompting, whether manual or automated, further boosts the reasoning capability of LLMs. However, existing automated methods lack quality assurance, while manual approaches suffer from limited scalability and poor diversity, hindering the capabilities of LLMs. In this paper, we propose Self-prompted Chain-of-Thought (SP-CoT), an automated framework for mass-producing high-quality CoTs of LLMs, by LLMs and for LLMs. SP-CoT introduces an automated pipeline for generating high-quality ODMR datasets, an adaptive sampler for in-context CoT selection, and self-prompted inference via in-context learning. Extensive experiments on four multi-hop question-answering benchmarks show that SP-CoT not only significantly surpasses previous SOTA methods on large-scale (175B) LLMs, but also nearly doubles the zero-shot performance of small-scale (13B) LLMs. Further analysis reveals the remarkable capability of SP-CoT to elicit direct and concise intermediate reasoning steps, recalling ∼50% of intermediate answers on the MuSiQue-Ans dataset.
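The abstract names three components: automated ODMR dataset generation, an adaptive sampler for in-context CoT selection, and self-prompted inference via in-context learning. As a rough, hypothetical sketch of the second component, the Python snippet below clusters a pool of self-generated (question, CoT, answer) demonstrations and picks, from each cluster, the demonstration most similar to the test question, yielding an in-context prompt that is both relevant and diverse. The TF-IDF encoder, all names, and the diversity heuristic are assumptions for illustration, not the authors' implementation.

    # Hypothetical sketch of adaptive in-context CoT selection (not the paper's code).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans
    from sklearn.metrics.pairwise import cosine_similarity

    def select_demonstrations(question, pool, k=4):
        """pool: list of (question, cot, answer) triples self-generated by the LLM."""
        texts = [q for q, _, _ in pool]
        vec = TfidfVectorizer().fit(texts + [question])   # stand-in for a real encoder
        X = vec.transform(texts)
        q_emb = vec.transform([question])
        # One cluster per demonstration slot keeps the selected CoTs diverse.
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X.toarray())
        sims = cosine_similarity(q_emb, X)[0]
        chosen = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            chosen.append(max(members, key=lambda i: sims[i]))  # nearest in each cluster
        chosen.sort(key=lambda i: sims[i])  # least-to-most similar ordering heuristic
        return [pool[i] for i in chosen]

    def build_prompt(question, demos):
        """Assemble the self-prompted in-context-learning prompt."""
        parts = [f"Question: {q}\nAnswer: {cot} So the answer is {a}."
                 for q, cot, a in demos]
        parts.append(f"Question: {question}\nAnswer:")
        return "\n\n".join(parts)

Under these assumptions, select_demonstrations requires at least k pooled demonstrations, and the resulting prompt elicits step-by-step answers in the same format as the selected CoTs.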
Pages: 2717-2731 (15 pages)
Related Papers (37 total)
  • [1] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    Wei, Jason
    Wang, Xuezhi
    Schuurmans, Dale
    Bosma, Maarten
    Ichter, Brian
    Xia, Fei
    Chi, Ed H.
    Le, Quoc V.
    Zhou, Denny
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [2] Chain-of-Thought Reasoning in Tabular Language Models
    Zheng, Mingyu
    Hao, Yang
    Jiang, Wenbin
    Lin, Zheng
    Lyu, Yajuan
    She, Qiaoqiao
    Wang, Weiping
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 11006 - 11019
  • [3] Multi-Hop Paragraph Retrieval for Open-Domain Question Answering
    Feldman, Yair
    El-Yaniv, Ran
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 2296 - 2309
  • [4] On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
    Nowak, Franz
    Svete, Anej
    Butoi, Alexandra
    Cotterell, Ryan
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 12510 - 12548
  • [5] Active Prompting with Chain-of-Thought for Large Language Models
    Diao, Shizhe
    Wang, Pengcheng
    Lin, Yong
    Pan, Rui
    Liu, Xiang
    Zhang, Tong
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1330 - 1350
  • [6] Moderating New Waves of Online Hate with Chain-of-Thought Reasoning in Large Language Models
    Vishwamitra, Nishant
    Guo, Keyan
    Romit, Farhan Tajwar
    Ondracek, Isabelle
    Cheng, Long
    Zhao, Ziming
    Hu, Hongxin
    45TH IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP 2024, 2024, : 788 - 806
  • [7] Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models
    He, Liqi
    Li, Zuchao
    Cai, Xiantao
    Wang, Ping
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 18180 - 18187
  • [8] Performance evaluation of large language models with chain-of-thought reasoning ability in clinical laboratory case interpretation
    Yang, He S.
    Li, Jieli
    Yi, Xin
    Wang, Fei
CLINICAL CHEMISTRY AND LABORATORY MEDICINE, 2025
  • [9] ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models
    Chen, Zhipeng
    Zhou, Kun
    Zhang, Beichen
    Gong, Zheng
    Zhao, Wayne Xin
    Wen, Ji-Rong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 14777 - 14790
  • [10] Improving intermediate reasoning in zero-shot chain-of-thought for large language models with filter supervisor-self correction
    Sun, Jun
    Pan, Yiteng
    Yan, Xiaohu
    NEUROCOMPUTING, 2025, 620