Pointing the Unknown Words

Cited by: 153
Authors
Gulcehre, Caglar [1]
Ahn, Sungjin [1]
Nallapati, Ramesh [2]
Zhou, Bowen [2]
Bengio, Yoshua [1]
Affiliations
[1] Univ Montreal, Montreal, PQ, Canada
[2] IBM TJ Watson Res, Yorktown Hts, NY USA
Source
PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1 | 2016
Funding
Natural Sciences and Engineering Research Council of Canada;
DOI
10.18653/v1/p16-1014
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The problem of rare and unknown words is an important issue that can affect the performance of many NLP systems, including both traditional count-based and deep learning models. We propose a novel way to deal with rare and unseen words in neural network models using attention. Our model uses two softmax layers to predict the next word in conditional language models: one predicts the location of a word in the source sentence, and the other predicts a word in the shortlist vocabulary. At each timestep, an MLP conditioned on the context adaptively decides which softmax layer to use. We motivate this work with psychological evidence that humans naturally tend to point towards objects in the context or the environment when the name of an object is not known. Using our proposed model, we observe improvements on two tasks: neural machine translation on the Europarl English-to-French parallel corpora and text summarization on the Gigaword dataset.
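The two-softmax mechanism described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `pointer_softmax_step`, the parameter shapes, the single-hidden-layer switch MLP, and the choice to mix the two distributions softly by the switch probability are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pointer_softmax_step(context, attention_scores, W_vocab, W1, w2):
    """One decoding step (hypothetical helper, illustrative only).

    Returns a joint distribution over shortlist words followed by
    source positions, plus the switch probability.
    """
    # Softmax over the shortlist vocabulary.
    p_shortlist = softmax(W_vocab @ context)
    # Softmax over source-sentence locations (attention scores assumed given).
    p_location = softmax(attention_scores)
    # Small MLP conditioned on the context decides which softmax to use.
    hidden = np.tanh(W1 @ context)
    switch = sigmoid(w2 @ hidden)  # probability of using the shortlist softmax
    joint = np.concatenate([switch * p_shortlist, (1.0 - switch) * p_location])
    return joint, switch

# Toy dimensions (assumed): context size, shortlist size, source length, MLP width.
d, V, T, H = 8, 5, 4, 6
rng = np.random.default_rng(0)
context = rng.normal(size=d)
attention_scores = rng.normal(size=T)
W_vocab = rng.normal(size=(V, d))
W1 = rng.normal(size=(H, d))
w2 = rng.normal(size=H)

probs, switch = pointer_softmax_step(context, attention_scores, W_vocab, W1, w2)
```

Here the first `V` entries of `probs` score shortlist words and the remaining `T` entries score source positions to copy from; together they sum to one, so a rare word absent from the shortlist can still receive probability mass through its source location.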
Pages: 140-149
Number of pages: 10