Coarse-to-Fine Question Answering for Long Documents

Cited by: 73
Authors
Choi, Eunsol [1 ,2 ]
Hewlett, Daniel [2 ]
Uszkoreit, Jakob [2 ]
Polosukhin, Illia [2 ,3 ]
Lacoste, Alexandre [2 ,4 ]
Berant, Jonathan [2 ,5 ]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] Google, Mountain View, CA USA
[3] XIX Ai, San Francisco, CA USA
[4] Element AI, Montreal, PQ, Canada
[5] Tel Aviv Univ, Tel Aviv, Israel
Source
PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1 | 2017
Funding
Israel Science Foundation
Keywords
DOI
10.18653/v1/P17-1020
CLC Classification
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving the performance of state-of-the-art models. While most successful approaches for reading comprehension rely on recurrent neural networks (RNNs), running them over long documents is prohibitively slow because it is difficult to parallelize over sequences. Inspired by how people first skim a document, identify relevant parts, and carefully read those parts to produce an answer, we combine a coarse, fast model for selecting relevant sentences with a more expensive RNN for producing the answer from those sentences. We treat sentence selection as a latent variable trained jointly from the answer alone using reinforcement learning. Experiments demonstrate state-of-the-art performance on a challenging subset of the WIKIREADING dataset (Hewlett et al., 2016) and on a new dataset, while speeding up the model by 3.5x-6.7x.
Pages: 209-220
Page count: 12