CAM: A Combined Attention Model for Natural Language Inference

Cited by: 0
Authors
Gajbhiye, Amit [1 ]
Jaf, Sardar [1 ]
Al Moubayed, Noura [1 ]
Bradley, Steven [1 ]
McGough, A. Stephen [2 ]
Affiliations
[1] Univ Durham, Dept Comp Sci, Durham, England
[2] Newcastle Univ, Sch Comp, Newcastle Upon Tyne, Tyne & Wear, England
Source
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2018
Keywords
Natural Language Inference; Textual Entailment; Deep Learning; Attention Mechanism; SNLI dataset; SciTail dataset;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Natural Language Inference (NLI) is a fundamental step towards natural language understanding. The task aims to detect whether a premise entails or contradicts a given hypothesis. NLI contributes to a wide range of natural language understanding applications such as question answering, text summarization and information extraction. Recently, the public availability of big datasets such as Stanford Natural Language Inference (SNLI) and SciTail has made it feasible to train complex neural NLI models. In particular, Bidirectional Long Short-Term Memory networks (BiLSTMs) with attention mechanisms have shown promising performance for NLI. In this paper, we propose a Combined Attention Model (CAM) for NLI. CAM combines two attention mechanisms: intra-attention and inter-attention. The model first captures the semantics of the individual input premise and hypothesis with intra-attention and then aligns the premise and hypothesis with inter-sentence attention. We evaluate CAM on two benchmark datasets, Stanford Natural Language Inference (SNLI) and SciTail, achieving 86.14% accuracy on SNLI and 77.23% on SciTail. Further, to investigate the effectiveness of each attention mechanism individually and in combination, we present an analysis showing that the intra- and inter-attention mechanisms achieve higher accuracy when combined than when used independently.
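The abstract describes the model only at a high level: a BiLSTM encoder, intra-attention within each sentence, then inter-attention aligning premise and hypothesis. The sketch below is a minimal, hypothetical PyTorch illustration of that idea, not the authors' implementation; the layer sizes, attention formulations, pooling, and classifier are assumptions made for illustration.

```python
# Hypothetical sketch of a combined intra-/inter-attention NLI model.
# All architectural details here are illustrative assumptions, not the
# CAM architecture as published.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CombinedAttentionNLI(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Shared BiLSTM encoder for premise and hypothesis.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # Intra-attention: scores each time step within a single sentence.
        self.intra_score = nn.Linear(2 * hidden_dim, 1)
        # Classifier over concatenated premise/hypothesis representations.
        self.classifier = nn.Sequential(
            nn.Linear(8 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def intra_attend(self, states):
        # states: (batch, seq_len, 2*hidden). Weight each position by a
        # learned score and re-weight the sequence.
        weights = F.softmax(self.intra_score(states), dim=1)   # (B, T, 1)
        return states * weights

    def inter_attend(self, premise, hypothesis):
        # Soft alignment: each premise position attends over the hypothesis
        # and vice versa (dot-product attention).
        scores = torch.bmm(premise, hypothesis.transpose(1, 2))  # (B, Tp, Th)
        p_aligned = torch.bmm(F.softmax(scores, dim=2), hypothesis)
        h_aligned = torch.bmm(F.softmax(scores, dim=1).transpose(1, 2), premise)
        return p_aligned, h_aligned

    def forward(self, premise_ids, hypothesis_ids):
        p, _ = self.encoder(self.embed(premise_ids))
        h, _ = self.encoder(self.embed(hypothesis_ids))
        # Step 1: intra-attention within each sentence.
        p, h = self.intra_attend(p), self.intra_attend(h)
        # Step 2: inter-attention aligning premise and hypothesis.
        p_aligned, h_aligned = self.inter_attend(p, h)
        # Max-pool over time and classify entailment/contradiction/neutral.
        v_p = torch.cat([p.max(dim=1).values, p_aligned.max(dim=1).values], dim=-1)
        v_h = torch.cat([h.max(dim=1).values, h_aligned.max(dim=1).values], dim=-1)
        return self.classifier(torch.cat([v_p, v_h], dim=-1))


# Example usage with toy token IDs.
model = CombinedAttentionNLI(vocab_size=10000)
premise = torch.randint(0, 10000, (2, 12))     # batch of 2 premises, 12 tokens
hypothesis = torch.randint(0, 10000, (2, 9))   # batch of 2 hypotheses, 9 tokens
logits = model(premise, hypothesis)            # shape: (2, 3)
print(logits.shape)
```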
Pages: 1009-1014
Number of pages: 6