Reading Wikipedia to Answer Open-Domain Questions

Cited by: 840
Authors
Chen, Danqi [1, 2]
Fisch, Adam [2 ]
Weston, Jason [2 ]
Bordes, Antoine [2 ]
Affiliations
[1] Stanford Univ, Comp Sci, Stanford, CA 94305 USA
[2] Facebook AI Res, 770 Broadway, New York, NY 10003 USA
Source
PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1 | 2017
DOI
10.18653/v1/P17-1171
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. This task of machine reading at scale combines the challenges of document retrieval (finding the relevant articles) with that of machine comprehension of text (identifying the answer spans from those articles). Our approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA datasets indicate that (1) both modules are highly competitive with respect to existing counterparts and (2) multitask learning using distant supervision on their combination is an effective complete system on this challenging task.
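The retrieval idea named in the abstract (bigram hashing combined with TF-IDF matching) can be illustrated with a minimal sketch. This is not the authors' implementation: the bucket count, the CRC32 hash, the whitespace tokenizer, the log-scaled weighting, and the function names `hashed_ngrams`, `build_index`, and `rank` are all assumptions chosen for illustration.

```python
import math
import zlib
from collections import Counter

NUM_BUCKETS = 2 ** 24  # assumed bucket count for this sketch


def hashed_ngrams(text):
    """Map the unigrams and bigrams of `text` to hash-bucket ids."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    grams = tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    # CRC32 stands in for whatever hash a real system would use.
    return [zlib.crc32(g.encode("utf-8")) % NUM_BUCKETS for g in grams]


def build_index(docs):
    """Return sparse TF-IDF vectors (one dict per document) and the IDF table."""
    counts = [Counter(hashed_ngrams(d)) for d in docs]
    doc_freq = Counter(bucket for c in counts for bucket in c)
    n_docs = len(docs)
    # Smoothed IDF; stays positive because df <= n_docs.
    idf = {b: math.log((n_docs + 1) / (df + 0.5)) for b, df in doc_freq.items()}
    vectors = [{b: math.log1p(tf) * idf[b] for b, tf in c.items()} for c in counts]
    return vectors, idf


def rank(question, vectors, idf):
    """Return document indices sorted by sparse dot product with the question."""
    q_counts = Counter(hashed_ngrams(question))
    q_vec = {b: math.log1p(tf) * idf[b] for b, tf in q_counts.items() if b in idf}
    scores = [sum(w * v.get(b, 0.0) for b, w in q_vec.items()) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
```

Hashing bigrams into a fixed number of buckets keeps the index size bounded regardless of vocabulary, at the cost of occasional collisions. In a full pipeline of the kind the abstract describes, the top-ranked articles would then be passed to the reading-comprehension model.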
Pages: 1870-1879
Number of pages: 10