A study about the future evaluation of Question-Answering systems

Cited by: 22

Authors:

Rodrigo, Alvaro [1]

Penas, Anselmo [1]

Affiliations:

[1] UNED, NLP & IR Grp, Juan Rosal 16, Madrid, Spain

Source:

KNOWLEDGE-BASED SYSTEMS | 2017 / Volume 137

Keywords:

Question Answering; Evaluation campaigns; Validation; Textual inference; TESTS;

DOI:

10.1016/j.knosys.2017.09.015

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Evaluation campaigns for Question Answering (QA) systems have contributed to the development of such technologies. These campaigns have promoted changes aimed at improving results. However, systems now appear to have reached an upper bound and are still far from answering complex questions. In this paper, we review the main QA evaluations over free text, paying special attention to the changes encouraged in those campaigns. We observe that systems still return a high proportion of incorrect answers and that the changes have hardly been incorporated into traditional approaches. Moreover, we analyze QA collections to gain better insight into the main challenges for current QA systems. We find that QA systems have great difficulty dealing with different rewordings in questions and documents, as well as inferring information that is not explicitly mentioned in the texts. Based on these observations, we recommend a set of directions for future evaluations, suggesting the application of textual inference and knowledge bases as a way to improve results. © 2017 Elsevier B.V. All rights reserved.


Pages: 83-93

Page count: 11

