Multi-head attention based candidate segment selection in QA over hybrid data

Cited by: 1
Authors
Chen, Qian [1, 2]
Gao, Xiaoying [3]
Guo, Xin [1]
Wang, Suge [1, 2]
Affiliations
[1] Shanxi Univ, Sch Comp & Informat Technol, Taiyuan, Shanxi, Peoples R China
[2] Shanxi Univ, Minist Educ, Key Lab Computat Intelligence & Chinese Informat, Taiyuan, Shanxi, Peoples R China
[3] Tongji Univ, Dept Comp Sci & Technol, Shanghai, Peoples R China
Keywords
Question answering on tabular and textual data; Wrong Evidence Ratio; Missing Evidence Ratio; multi-head attention
DOI
10.3233/IDA-227032
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Question answering over tabular and textual data (TAT-QA) is a task proposed in recent years in the field of QA. At present, most QA systems return answers from a single data form, such as knowledge graphs, tables, or texts. However, hybrid data that combines structured and unstructured content is far more pervasive in real life than any single form. Recent research on TAT-QA suffers mainly from a high error rate when extracting supporting evidence from both tabular and textual content. This paper aims to address the problem of failed evidence extraction from more complex and realistic hybrid data. We first propose two metrics to evaluate the performance of evidence extraction on hybrid data: wrong evidence ratio (WER) and missing evidence ratio (MER). We then use a candidate extractor to obtain supporting evidence related to the question. Third, an origin selector is designed to determine where the answer to the question comes from. Finally, the loss of the origin selector is fused into the final loss function, which improves evidence extraction performance. Experimental results on the TAT-QA dataset show that the proposed model outperforms the best baseline in terms of F1, WER and MER, which demonstrates the effectiveness of the model.
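The abstract names WER and MER but does not give their formulas. As a rough illustration only, the sketch below assumes one plausible set-based reading: WER as the share of predicted evidence items that are not in the gold evidence, and MER as the share of gold evidence items the prediction misses. The function name `evidence_ratios`, the item identifiers, and both formulas are assumptions for illustration, not the paper's definitions.

```python
def evidence_ratios(predicted, gold):
    """Return (wer, mer) under ASSUMED set-based definitions.

    wer: fraction of predicted evidence items absent from the gold set.
    mer: fraction of gold evidence items absent from the prediction.
    These formulas are hypothetical; the paper may define them differently.
    """
    predicted, gold = set(predicted), set(gold)
    wrong = predicted - gold      # extracted but not supporting evidence
    missing = gold - predicted    # supporting evidence that was not extracted
    wer = len(wrong) / len(predicted) if predicted else 0.0
    mer = len(missing) / len(gold) if gold else 0.0
    return wer, mer


# Hypothetical identifiers for table cells ("cell_*") and text spans ("span_*").
wer, mer = evidence_ratios(["cell_1", "cell_2", "span_3"], ["cell_1", "span_4"])
```

Under this reading, lower values are better for both ratios, and a model can trade one against the other by extracting more or fewer candidate segments.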
Pages: 1839-1852
Page count: 14