Beyond IID: Three Levels of Generalization for Question Answering on Knowledge Bases

Cited by: 105
Authors
Gu, Yu [1]
Kase, Sue [2]
Vanni, Michelle T. [2]
Sadler, Brian M. [2]
Liang, Percy [3]
Yan, Xifeng [4]
Su, Yu [1]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
[2] US Army Res Lab, Aberdeen Proving Ground, MD USA
[3] Stanford Univ, Stanford, CA 94305 USA
[4] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Source
PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2021 (WWW 2021) | 2021
Funding
National Science Foundation (USA);
Keywords
Knowledge Base; Question Answering; Semantic Parsing;
DOI
10.1145/3442381.3449992
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Existing studies on question answering on knowledge bases (KBQA) mainly operate with the standard i.i.d. assumption, i.e., training distribution over questions is the same as the test distribution. However, i.i.d. may be neither achievable nor desirable on large-scale KBs because 1) true user distribution is hard to capture and 2) randomly sampling training examples from the enormous space would be data-inefficient. Instead, we suggest that KBQA models should have three levels of built-in generalization: i.i.d., compositional, and zero-shot. To facilitate the development of KBQA models with stronger generalization, we construct and release a new large-scale, high-quality dataset with 64,331 questions, GRAILQA, and provide evaluation settings for all three levels of generalization. In addition, we propose a novel BERT-based KBQA model. The combination of our dataset and model enables us to thoroughly examine and demonstrate, for the first time, the key role of pre-trained contextual embeddings like BERT in the generalization of KBQA.
Pages: 3477-3488
Page count: 12