Boosting question answering over knowledge graph with reward integration and policy evaluation under weak supervision

Cited by: 51
Authors
Bi, Xin [1, 2]
Nie, Haojie [3]
Zhang, Guoliang [1, 2]
Hu, Lei [1, 2]
Ma, Yuliang [4]
Zhao, Xiangguo [5]
Yuan, Ye [6]
Wang, Guoren [6]
Affiliations
[1] Northeastern Univ, Key Lab, Minist Educ Safe Min Deep Met Mines, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Key Lab Liaoning Prov Deep Engn & Intelligent Tech, Shenyang 110819, Peoples R China
[3] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110819, Peoples R China
[4] Northeastern Univ, Sch Business Adm, Shenyang 110819, Peoples R China
[5] Northeastern Univ, Coll Software, Shenyang 110819, Peoples R China
[6] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
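Funding
National Natural Science Foundation of China
Keywords
Knowledge graph-based question answering; Multi-hop reasoning; Weak supervision; Augmented intelligence for decision-making
DOI
10.1016/j.ipm.2022.103242
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract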
Among existing knowledge graph-based question answering (KGQA) methods, relation supervision methods require labeled intermediate relations for stepwise reasoning. To avoid the enormous cost of labeling on large-scale knowledge graphs, weak supervision methods have been introduced, which use only the answer entity to evaluate rewards as supervision. However, the lack of intermediate supervision raises the issue of sparse rewards, which may result in two types of incorrect reasoning paths: (1) incorrectly reasoned relations, even when the final answer entity may be correct; (2) correctly reasoned relations in the wrong order, which leads to an incorrect answer entity. To address these issues, this paper treats the multi-hop KGQA task as a Markov decision process and proposes a model based on Reward Integration and Policy Evaluation (RIPE). In this model, an integrated reward function is designed to evaluate the reasoning process by leveraging both terminal and instant rewards. The intermediate supervision for each single reasoning hop is constructed with regard to both the fitness of the taken action and an evaluation of the unreasoned information remaining in the updated question embeddings. In addition, to lead the agent to the answer entity along the correct reasoning path, an evaluation network is designed to evaluate the action taken in each hop. Extensive ablation studies and comparative experiments are conducted on four KGQA benchmark datasets. The results demonstrate that the proposed model outperforms state-of-the-art approaches in terms of answering accuracy.
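The idea of combining terminal and instant rewards can be illustrated with a minimal sketch. This is not the RIPE reward function, whose exact form is not given in this record: the function names, the `instant_weight` and `discount` parameters, and the additive combination are all assumptions chosen only to show how a sparse end-of-episode signal might be supplemented by per-hop signals.

```python
from typing import Sequence


def terminal_reward(predicted_entity: str, answer_entity: str) -> float:
    """Weak-supervision signal: 1.0 only if the final entity is the answer."""
    return 1.0 if predicted_entity == answer_entity else 0.0


def integrated_reward(
    instant_rewards: Sequence[float],
    predicted_entity: str,
    answer_entity: str,
    instant_weight: float = 0.5,  # hypothetical weight, not from the paper
    discount: float = 0.9,        # hypothetical per-hop discount factor
) -> float:
    """Add a discounted sum of per-hop rewards to the terminal reward.

    instant_rewards[t] stands in for whatever score is assigned to the
    action taken at hop t; how the paper computes it is not shown here.
    """
    shaped = sum((discount ** t) * r for t, r in enumerate(instant_rewards))
    return terminal_reward(predicted_entity, answer_entity) + instant_weight * shaped


if __name__ == "__main__":
    # A two-hop episode that reaches the correct answer entity.
    print(integrated_reward([0.8, 0.6], "entity_42", "entity_42"))
    # The same hops ending on a wrong entity: only the instant part remains.
    print(integrated_reward([0.8, 0.6], "entity_7", "entity_42"))
```

With only the terminal reward, both episodes above would differ by a single 0/1 value at the end; the per-hop terms give the agent a non-zero signal even when the final entity is wrong.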
Pages: 17