Large Language Models for Automated Open-domain Scientific Hypotheses Discovery

Cited: 0
Authors
Yang, Zonglin [1 ]
Du, Xinya [2 ]
Li, Junxian [1 ]
Zheng, Jie [3 ]
Poria, Soujanya [4 ]
Cambria, Erik [1 ]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
[2] Univ Texas Dallas, Richardson, TX 75083 USA
[3] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
[4] Singapore Univ Technol & Design, Singapore, Singapore
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024 | 2024
DOI
Not available
Abstract
Hypothetical induction is recognized as the main reasoning type when scientists make observations about the world and try to propose hypotheses to explain those observations. Past research on hypothetical induction has been conducted under a constrained setting: (1) the observation annotations in the dataset are carefully hand-picked sentences (resulting in a closed-domain setting); and (2) the ground-truth hypotheses are mostly commonsense knowledge, making the task less challenging. In this work, we tackle these problems by proposing the first dataset for social science academic hypothesis discovery, with the final goal of creating systems that automatically generate valid, novel, and helpful scientific hypotheses, given only a raw web corpus. Unlike previous settings, the new dataset requires (1) using open-domain data (a raw web corpus) as observations, and (2) proposing hypotheses that may even be new to humanity. A multi-module framework is developed for the task, including three different feedback mechanisms to boost performance, and it exhibits superior performance under both GPT-4-based and expert-based evaluation. To the best of our knowledge, this is the first work showing that LLMs are able to generate novel ("not existing in the literature") and valid ("reflecting reality") scientific hypotheses.
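The abstract only names a multi-module framework with feedback mechanisms; this record contains no implementation details, so the Python sketch below is purely an illustrative generate-then-refine loop driven by LLM feedback, not the authors' framework. The `llm` callable, the prompts, the number of refinement rounds, and the single critique step are all assumptions introduced here for illustration.

```python
# Illustrative generate-and-refine loop for hypothesis discovery with LLM feedback.
# NOT the authors' code: the prompts, the single critique step, and the `llm`
# callable (any text-in/text-out model wrapper) are assumptions for illustration.
from typing import Callable, List


def propose_hypothesis(llm: Callable[[str], str],
                       observations: List[str],
                       rounds: int = 3) -> str:
    """Draft a hypothesis from raw observations, then refine it with LLM feedback."""
    context = "\n".join(observations)
    # Initial proposal from open-domain observations (e.g., raw web text).
    hypothesis = llm(
        "You are a social-science researcher.\n"
        f"Observations (raw web text):\n{context}\n"
        "Propose one novel, valid, and clearly stated research hypothesis."
    )
    for _ in range(rounds):
        # Feedback step: critique the current hypothesis on validity and novelty.
        feedback = llm(
            f"Hypothesis: {hypothesis}\n"
            "Critique this hypothesis for validity (does it reflect reality?), "
            "novelty (is it absent from the literature?), and clarity. "
            "List concrete improvements."
        )
        # Refinement step: rewrite the hypothesis to address the critique.
        hypothesis = llm(
            f"Hypothesis: {hypothesis}\nFeedback: {feedback}\n"
            "Rewrite the hypothesis to address the feedback. Return only the hypothesis."
        )
    return hypothesis
```

In practice `llm` could wrap any chat-completion API; the three specific feedback mechanisms and the GPT-4-based evaluation mentioned in the abstract are not described in this record and are not reproduced here.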
Pages: 13545-13565
Page count: 21