A Stacked BiLSTM Neural Network Based on Coattention Mechanism for Question Answering

Cited by: 36
Authors
Cai, Linqin [1]
Zhou, Sitong [1]
Yan, Xun [1]
Yuan, Rongdi [1]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Minist Educ, Key Lab Ind Internet Things & Networked Control, Chongqing 400065, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
ATTENTION; REPRESENTATION; RECOGNITION; SELECTION;
DOI
10.1155/2019/9543490
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Deep learning is a key technology for intelligent question answering, and it has been widely adopted in recent question answering research. The challenge is that the task requires not only an effective semantic understanding model to generate textual representations but also consideration of the semantic interaction between questions and answers. In this paper, we propose a stacked Bidirectional Long Short-Term Memory (BiLSTM) neural network based on the coattention mechanism to extract the interaction between questions and answers, and we combine cosine similarity and Euclidean distance to score question and answer sentences. The model is evaluated on the publicly available Text REtrieval Conference (TREC) 8-13 dataset and the Wiki-QA dataset. Experimental results confirm that the proposed model is efficient; in particular, it achieves a mean average precision (MAP) of 0.7613 and a mean reciprocal rank (MRR) of 0.8401 on the TREC dataset.
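For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mean average precision and mean reciprocal rank are commonly computed for answer ranking. It is not taken from the paper: the function names and the toy relevance labels are hypothetical, and it assumes each question comes with a list of 0/1 relevance flags already sorted by model score. Questions with no correct answer contribute 0 here, whereas many evaluation scripts exclude them instead.

```python
from typing import List


def average_precision(labels: List[int]) -> float:
    """Average precision for one question.

    `labels` holds 1 (correct) or 0 (incorrect) for each candidate answer,
    in the order the model ranked them (best first).
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this correct answer
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(labels: List[int]) -> float:
    """1 / rank of the first correct answer, or 0.0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def mean_average_precision(ranked: List[List[int]]) -> float:
    """Mean of the per-question average precision values."""
    return sum(average_precision(labels) for labels in ranked) / len(ranked)


def mean_reciprocal_rank(ranked: List[List[int]]) -> float:
    """Mean of the per-question reciprocal ranks."""
    return sum(reciprocal_rank(labels) for labels in ranked) / len(ranked)


if __name__ == "__main__":
    # Hypothetical toy data: two questions with three ranked candidates each.
    toy = [[1, 0, 1], [0, 1, 0]]
    print(f"MAP = {mean_average_precision(toy):.4f}")
    print(f"MRR = {mean_reciprocal_rank(toy):.4f}")
```

Both metrics lie in the range 0 to 1. MAP rewards placing every correct answer near the top, while MRR depends only on the position of the first correct answer.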
Pages: 12