Language-based reasoning graph neural network for commonsense question answering

Times Cited: 0
Authors
Yang, Meng [1 ]
Wang, Yihao [1 ,2 ]
Gu, Yu [1 ,2 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Sun Yat Sen Univ, Minist Educ, Key Lab Machine Intelligence & Adv Comp, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Commonsense QA; Language-based reasoning; External knowledge;
DOI
10.1016/j.neunet.2024.106816
CLC Number (Chinese Library Classification)
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Language models (LMs) play an increasingly important role in commonsense understanding and reasoning for the CSQA task (Commonsense Question Answering). However, given the scale of current model parameters, adding training data yields little further improvement in performance. Introducing external knowledge through graph neural networks (GNNs) has proven effective in boosting performance, but exploiting heterogeneous knowledge sources and capturing the contextual interaction between text and knowledge remain challenging. In this paper, we propose LBR-GNN, a Language-Based Reasoning Graph Neural Network that addresses these problems by representing the question, each candidate answer, and external knowledge with a language model and predicting a reasoning score with a purpose-built language-based GNN. LBR-GNN first normalizes external knowledge into a consistent textual form and encodes it with a standard LM to capture contextual information. It then builds a graph neural network over the encoded information, in particular the language-level edge representations. Finally, it applies a novel edge aggregation method that selects edge information for the GNN update and performs language-guided GNN reasoning. We assess LBR-GNN on the CommonsenseQA, CommonsenseQA-IH, and OpenBookQA datasets. It outperforms state-of-the-art methods on the CSQA dataset by more than 5%, with a similar number of additional parameters.
Pages: 13
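
The abstract outlines a three-step pipeline: verbalize external knowledge into consistent text, encode nodes and edges with a language model, and aggregate language-level edge information to drive the GNN update. The sketch below is a minimal PyTorch illustration of that idea, not the paper's implementation; the module names, the scatter softmax used for edge aggregation, and the GRU node update are assumptions of ours.

# Minimal sketch of the pipeline described in the abstract: LM-encoded nodes
# and edges, softmax-based edge aggregation, and a GRU node update.
# Illustration only, not the authors' code; every name here is assumed.
import torch
import torch.nn as nn


def softmax_per_node(logits: torch.Tensor, index: torch.Tensor,
                     num_nodes: int) -> torch.Tensor:
    """Scatter softmax: normalize edge logits within each destination node."""
    maxes = torch.full((num_nodes,), float("-inf"),
                       dtype=logits.dtype, device=logits.device)
    maxes = maxes.index_reduce_(0, index, logits, "amax")
    exp = torch.exp(logits - maxes[index])
    denom = torch.zeros(num_nodes, dtype=logits.dtype,
                        device=logits.device).index_add_(0, index, exp)
    return exp / (denom[index] + 1e-12)


class LanguageEdgeGNNLayer(nn.Module):
    """One message-passing layer whose edges carry LM-derived text embeddings.

    Assumes each node (question, answer choice, knowledge entity) and each
    edge (a relation verbalized as a sentence) has already been encoded by
    the same LM into a d-dimensional vector.
    """

    def __init__(self, d: int):
        super().__init__()
        self.msg = nn.Linear(3 * d, d)   # message from (source, edge, target)
        self.score = nn.Linear(d, 1)     # scores each edge for aggregation
        self.update = nn.GRUCell(d, d)   # node update from pooled messages

    def forward(self, h, edge_index, edge_repr):
        # h: (N, d) node states; edge_index: (2, E); edge_repr: (E, d)
        src, dst = edge_index
        m = torch.tanh(self.msg(torch.cat([h[src], edge_repr, h[dst]], dim=-1)))
        # Edge aggregation: a softmax over each node's incoming edges decides
        # which language-level edges drive that node's update.
        alpha = softmax_per_node(self.score(m).squeeze(-1), dst, h.size(0))
        pooled = torch.zeros_like(h).index_add_(0, dst, alpha.unsqueeze(-1) * m)
        return self.update(pooled, h)


# Toy usage: 4 nodes (question, one answer choice, two knowledge entities).
d = 8
h = torch.randn(4, d)                               # LM-encoded node texts
edge_index = torch.tensor([[0, 2, 3], [1, 1, 1]])   # all edges point at node 1
edge_repr = torch.randn(3, d)                       # LM-encoded edge sentences
layer = LanguageEdgeGNNLayer(d)
h_new = layer(h, edge_index, edge_repr)             # updated node states (4, 8)

Here the softmax over each node's incoming edges stands in for the abstract's edge-selection step: edges whose language-level messages score higher dominate that node's update.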