Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture

Cited by: 69
Authors
Tay, Yi [1]
Phan, Minh C. [1]
Luu Anh Tuan [2]
Hui, Siu Cheung [1]
Affiliations
[1] Nanyang Technological University, Singapore, Singapore
[2] Institute for Infocomm Research, Singapore, Singapore
Source
SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 2017
Keywords
Deep Learning; Long Short-Term Memory; Learning to Rank; Question Answering
DOI
10.1145/3077136.3080790
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We describe a new deep learning architecture for learning to rank question answer pairs. Our approach extends the long short-term memory (LSTM) network with holographic composition to model the relationship between question and answer representations. As opposed to the neural tensor layer that has been adopted recently, holographic composition provides a scalable and rich approach to representation learning without incurring huge parameter costs. Overall, we present Holographic Dual LSTM (HD-LSTM), a unified architecture for both deep sentence modeling and semantic matching. Our model is trained end-to-end, so that the parameters of the LSTM are optimized in a way that best explains the correlation between question and answer representations. In addition, our proposed deep learning architecture requires no extensive feature engineering. Via extensive experiments, we show that HD-LSTM outperforms many other neural architectures on two popular benchmark QA datasets. Empirical studies confirm the effectiveness of holographic composition over the neural tensor layer.
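The abstract does not define holographic composition. The sketch below assumes it refers to circular correlation, the operation used in holographic reduced representations and holographic embeddings, applied to a question vector q and an answer vector a of the same dimension d. The function names are hypothetical and the random vectors stand in for LSTM outputs; this is an illustration under those assumptions, not the authors' implementation.

```python
import numpy as np


def circular_correlation(q, a):
    """Circular correlation via FFT (assumed form of holographic composition).

    Computes [q * a]_k = sum_i q[i] * a[(k + i) mod d] for real vectors
    q and a of equal length d, in O(d log d) time and with no parameters.
    """
    d = q.shape[-1]
    # Correlation theorem: corr(q, a) = IFFT(conj(FFT(q)) * FFT(a)).
    return np.fft.irfft(np.conj(np.fft.rfft(q)) * np.fft.rfft(a), n=d)


def circular_correlation_naive(q, a):
    """Direct O(d^2) evaluation of the same sum, used only as a check."""
    d = q.shape[0]
    return np.array(
        [sum(q[i] * a[(k + i) % d] for i in range(d)) for k in range(d)]
    )


# Stand-ins for the question and answer representations (assumption:
# in the paper these would come from the two LSTM encoders).
rng = np.random.default_rng(0)
q_vec = rng.standard_normal(8)
a_vec = rng.standard_normal(8)

composed = circular_correlation(q_vec, a_vec)
reference = circular_correlation_naive(q_vec, a_vec)
```

Under this assumption the composed vector has the same dimension d as its inputs and the operation itself adds no trainable weights, which is consistent with the abstract's point about parameter cost; a bilinear tensor with k slices would instead need on the order of k * d * d weights.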
Pages: 695-704
Number of pages: 10