Large Language Models are Versatile Decomposers: Decomposing Evidence and Questions for Table-based Reasoning

Cited by: 12
Authors
Ye, Yunhu [1 ,4 ]
Hui, Binyuan [2 ]
Yang, Min [3 ]
Li, Binhua [2 ]
Huang, Fei [2 ]
Li, Yongbin [2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] DAMO Acad, Alibaba Grp, Hangzhou, Peoples R China
[3] Chinese Acad Sci, SIAT, Shenzhen, Peoples R China
[4] Chinese Acad Sci, Shenzhen Inst Adv Technol SIAT, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023 | 2023
Keywords
Table-based reasoning; Large language models; Pre-trained language models
DOI
10.1145/3539618.3591708
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Table-based reasoning has shown remarkable progress in a wide range of table-based tasks. It is a challenging task that requires reasoning over both free-form natural language (NL) questions and (semi-)structured tabular data. However, previous table-based reasoning solutions usually suffer from significant performance degradation on "huge" evidence (tables). In addition, most existing methods struggle to reason over complex questions since the essential information is scattered in different places. To alleviate the above challenges, we exploit large language models (LLMs) as decomposers for effective table-based reasoning, which (i) decompose huge evidence (a huge table) into sub-evidence (a small table) to mitigate the interference of useless information for table reasoning, and (ii) decompose a complex question into simpler sub-questions for text reasoning. First, we use a powerful LLM to decompose the evidence involved in the current question into sub-evidence that retains the relevant information and excludes the remaining irrelevant information from the "huge" evidence. Second, we propose a novel "parsing-execution-filling" strategy to decompose a complex question into simpler step-by-step sub-questions, generating intermediate SQL queries as a bridge to produce numerical and logical sub-questions with a powerful LLM. Finally, we leverage the decomposed sub-evidence and sub-questions to get the final answer with a few in-context prompting examples. Extensive experiments on three benchmark datasets (TabFact, WikiTableQuestions, and FetaQA) demonstrate that our method achieves significantly better results than competitive baselines for table-based reasoning. Notably, our method surpasses human performance for the first time on the TabFact dataset. In addition to impressive overall performance, our method also has the advantage of interpretability: the returned results are to some extent traceable through the generated sub-evidence and sub-questions. For reproducibility, we release our source code and data at: https://github.com/AlibabaResearch/DAMO-ConvAI.
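To make the pipeline concrete, the following is a minimal sketch of the two-stage decomposition the abstract describes. The `llm` callable, the prompt wording, and the selection/reply formats are hypothetical illustrations, not the paper's actual templates or API; the real prompts and SQL executor are in the released repository.

```python
# Minimal sketch of the decomposition pipeline described in the abstract.
# `llm`, the prompt wording, and the reply format are assumptions made for
# illustration; they are not the paper's actual templates or API.

from typing import Callable, List

Table = List[List[str]]      # row-major; row 0 is the header
LLM = Callable[[str], str]   # any text-completion backend

def decompose_evidence(llm: LLM, table: Table, question: str) -> Table:
    """Step 1: keep only the rows/columns the LLM marks as relevant,
    discarding the rest of the "huge" evidence."""
    header, *rows = table
    prompt = (
        "Reply as 'rows: i,j; cols: a,b'. Which rows and columns are needed "
        f"to answer: {question}\nColumns: {header}\nRows: {rows}"
    )
    reply = llm(prompt)                     # e.g. "rows: 0,2; cols: Team,Score"
    row_part, col_part = reply.split(";")
    row_ids = [int(i) for i in row_part.split(":")[1].split(",")]
    cols = [c.strip() for c in col_part.split(":")[1].split(",")]
    col_ids = [header.index(c) for c in cols]
    sub = [[header[c] for c in col_ids]]
    sub += [[rows[r][c] for c in col_ids] for r in row_ids]
    return sub

def decompose_question(llm: LLM, question: str, sub_table: Table) -> List[str]:
    """Step 2 ("parsing-execution-filling"): parse the question into an
    intermediate SQL query, execute it on the sub-evidence, and fill the
    numeric/logical results into simpler step-by-step sub-questions."""
    sql = llm(f"Write one SQL query over columns {sub_table[0]} "
              f"that helps answer: {question}")
    # result = execute_sql(sql, sub_table)  # execution backend omitted here
    return [f"Given that ({sql}) returns <result>, {question}"]

def answer(llm: LLM, table: Table, question: str) -> str:
    """Step 3: answer over sub-evidence and sub-questions with a few
    in-context prompting examples."""
    sub_table = decompose_evidence(llm, table, question)
    sub_questions = decompose_question(llm, question, sub_table)
    prompt = (f"<in-context examples>\nSub-table: {sub_table}\n"
              f"Sub-questions: {sub_questions}\nAnswer:")
    return llm(prompt)
```

The sketch fixes only the control flow; in the paper, the intermediate SQL is run by an external executor and its results are filled into cloze-style sub-questions before the final few-shot answering step.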
Pages: 174-184
Page count: 11