SelQA: A New Benchmark for Selection-based Question Answering

Cited by: 0
Authors
Jurczyk, Tomasz [1]
Zhai, Michael [1]
Choi, Jinho D. [1]
Affiliations
[1] Emory Univ, Math & Comp Sci, Atlanta, GA 30322 USA
Source
2016 IEEE 28TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2016) | 2016
DOI
10.1109/ICTAI.2016.125
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new selection-based question answering dataset, SelQA. The dataset consists of questions generated through crowdsourcing and sentence-length answers that are drawn from the ten most prevalent topics in the English Wikipedia. We introduce a corpus annotation scheme that enhances the generation of large, diverse, and challenging datasets by explicitly aiming to reduce word co-occurrences between questions and answers. Our annotation scheme is composed of a series of crowdsourcing tasks designed to make more effective use of crowdsourcing in the creation of question answering datasets in various domains. Several systems are compared on the tasks of answer sentence selection and answer triggering, providing strong baseline results for future work to improve upon.
Pages: 820-827
Page count: 8