A study about the future evaluation of Question-Answering systems

Cited by: 22
Authors
Rodrigo, Alvaro [1]
Penas, Anselmo [1]
Affiliations
[1] UNED, NLP & IR Grp, Juan Rosal 16, Madrid, Spain
Keywords
Question Answering; Evaluation campaigns; Validation; Textual inference; TESTS;
DOI
10.1016/j.knosys.2017.09.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Evaluation campaigns for Question Answering (QA) systems have contributed to the development of these technologies. The campaigns have promoted a number of changes aimed at improving results. However, systems now appear to have reached an upper bound and are still far from answering complex questions. In this paper, we review the main QA evaluations over free text, paying special attention to the changes encouraged by these campaigns. We observe that systems still return a high proportion of incorrect answers and that traditional approaches have barely incorporated those changes. Moreover, we analyze QA collections to gain better insight into the main challenges for current QA systems. We find that QA systems have great difficulty dealing with different rewordings in questions and documents, as well as inferring information that is not explicitly stated in texts. Based on these observations, we recommend a set of directions for future evaluations, suggesting the application of textual inference and knowledge bases as a way to improve results. © 2017 Elsevier B.V. All rights reserved.
Pages: 83-93
Page count: 11