Large Language Models for Automated Open-domain Scientific Hypotheses Discovery

Cited: 0
Authors
Yang, Zonglin [1 ]
Du, Xinya [2 ]
Li, Junxian [1 ]
Zheng, Jie [3 ]
Poria, Soujanya [4 ]
Cambria, Erik [1 ]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
[2] Univ Texas Dallas, Richardson, TX 75083 USA
[3] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
[4] Singapore Univ Technol & Design, Singapore, Singapore
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024 | 2024
DOI
Not available
Abstract
Hypothetical induction is recognized as the main reasoning type when scientists make observations about the world and try to propose hypotheses to explain those observations. Past research on hypothetical induction has been conducted under a constrained setting: (1) the observation annotations in the dataset are carefully hand-picked sentences (resulting in a closed-domain setting); and (2) the ground-truth hypotheses are mostly commonsense knowledge, making the task less challenging. In this work, we tackle these problems by proposing the first dataset for social science academic hypothesis discovery, with the final goal of creating systems that automatically generate valid, novel, and helpful scientific hypotheses, given only a raw web corpus. Unlike previous settings, the new dataset requires (1) using open-domain data (a raw web corpus) as observations, and (2) proposing hypotheses that may even be new to humanity. A multi-module framework is developed for the task, including three different feedback mechanisms to boost performance, and it exhibits superior performance under both GPT-4-based and expert-based evaluation. To the best of our knowledge, this is the first work showing that LLMs are able to generate novel ("not existing in the literature") and valid ("reflecting reality") scientific hypotheses.
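The abstract only names a multi-module framework with feedback mechanisms; this record contains no implementation details, so the Python sketch below is purely an illustrative generate-then-refine loop driven by LLM feedback, not the authors' framework. The `llm` callable, the prompts, the number of refinement rounds, and the single critique step are all assumptions introduced here for illustration.

```python
# Illustrative generate-and-refine loop for hypothesis discovery with LLM feedback.
# NOT the authors' code: the prompts, the single critique step, and the `llm`
# callable (any text-in/text-out model wrapper) are assumptions for illustration.
from typing import Callable, List


def propose_hypothesis(llm: Callable[[str], str],
                       observations: List[str],
                       rounds: int = 3) -> str:
    """Draft a hypothesis from raw observations, then refine it with LLM feedback."""
    context = "\n".join(observations)
    # Initial proposal from open-domain observations (e.g., raw web text).
    hypothesis = llm(
        "You are a social-science researcher.\n"
        f"Observations (raw web text):\n{context}\n"
        "Propose one novel, valid, and clearly stated research hypothesis."
    )
    for _ in range(rounds):
        # Feedback step: critique the current hypothesis on validity and novelty.
        feedback = llm(
            f"Hypothesis: {hypothesis}\n"
            "Critique this hypothesis for validity (does it reflect reality?), "
            "novelty (is it absent from the literature?), and clarity. "
            "List concrete improvements."
        )
        # Refinement step: rewrite the hypothesis to address the critique.
        hypothesis = llm(
            f"Hypothesis: {hypothesis}\nFeedback: {feedback}\n"
            "Rewrite the hypothesis to address the feedback. Return only the hypothesis."
        )
    return hypothesis
```

In practice `llm` could wrap any chat-completion API; the three specific feedback mechanisms and the GPT-4-based evaluation mentioned in the abstract are not described in this record and are not reproduced here.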
Pages: 13545-13565
Page count: 21