Testing the reasoning for question answering validation

Cited by: 11
Authors
Penas, Anselmo [1]
Rodrigo, Alvaro [1]
Sama, Valentin [1]
Verdejo, Felisa [1]
Affiliation
[1] Univ Nacl Educ Distancia, Depto Lenguajes & Sistemas Informat, Madrid 28040, Spain
Keywords
textual entailment; test collections; question answering; answer validation; evaluation
DOI
10.1093/logcom/exm072
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Question answering (QA) is a task that deserves more collaboration between natural language processing (NLP) and knowledge representation (KR) communities, not only to introduce reasoning when looking for answers or making use of answer type taxonomies and encyclopaedic knowledge, but also, as discussed here, for answer validation (AV), that is to say, to decide whether the responses of a QA system are correct or not. This was one of the motivations for the first Answer Validation Exercise at CLEF 2006 (AVE 2006). The starting point for AVE 2006 was the reformulation of answer validation as a recognizing textual entailment (RTE) problem, under the assumption that a hypothesis can be automatically generated by instantiating a hypothesis pattern with a QA system answer. The test collections that we developed in seven different languages at AVE 2006 are specially oriented to the development and evaluation of answer validation systems. We show in this article the methodology followed for developing these collections, taking advantage of the human assessments already made in the evaluation of QA systems. We also propose an evaluation framework for AV linked to a QA evaluation track. We quantify and discuss the source of errors introduced by the reformulation of the answer validation problem in terms of textual entailment (around 2%, in the range of inter-annotator disagreement). We also show the evaluation results of the first answer validation exercise at CLEF 2006, where 11 groups participated with 38 runs in seven different languages. The most extensively used techniques were Machine Learning and overlapping measures, but systems with broader knowledge resources and richer representation formalisms obtained the best results.
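The reformulation described in the abstract (instantiating a hypothesis pattern with a QA answer, then treating validation as an entailment decision) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: the `{answer}` placeholder syntax, the `EntailmentPair` type, and the functions `build_pair`, `token_overlap` and `validate` are hypothetical and are not taken from the article or from the AVE 2006 data format. The token-overlap score merely stands in for the "overlapping measures" the abstract mentions; it is a naive baseline, not the method of any participating system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntailmentPair:
    text: str        # supporting snippet returned by the QA system
    hypothesis: str  # hypothesis pattern filled with the candidate answer


def build_pair(pattern: str, answer: str, supporting_text: str) -> EntailmentPair:
    """Fill the (assumed) '{answer}' slot of a hypothesis pattern with a QA answer."""
    hypothesis = pattern.replace("{answer}", answer.strip())
    return EntailmentPair(text=supporting_text, hypothesis=hypothesis)


def _tokens(sentence: str) -> list[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    cleaned = (tok.strip(".,;:!?\"'()").lower() for tok in sentence.split())
    return [tok for tok in cleaned if tok]


def token_overlap(pair: EntailmentPair) -> float:
    """Fraction of hypothesis tokens that also occur in the supporting text."""
    text_tokens = set(_tokens(pair.text))
    hyp_tokens = _tokens(pair.hypothesis)
    if not hyp_tokens:
        return 0.0
    return sum(tok in text_tokens for tok in hyp_tokens) / len(hyp_tokens)


def validate(pair: EntailmentPair, threshold: float = 0.9) -> bool:
    """Accept the answer when the overlap reaches the (arbitrary) threshold."""
    return token_overlap(pair) >= threshold


# Hypothetical example: one pattern, one supporting snippet, two candidate answers.
pattern = "The capital of Croatia is {answer}."
snippet = "Zagreb is the capital of Croatia and its largest city."
good = build_pair(pattern, "Zagreb", snippet)
bad = build_pair(pattern, "Split", snippet)
```

Note that the wrong candidate still shares five of its six hypothesis tokens with the snippet, so the decision depends entirely on the threshold. This fragility is in line with the abstract's observation that systems with broader knowledge resources and richer representations outperformed overlap-based ones.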
Pages: 459-474
Page count: 16
Related papers
50 in total
  • [41] PRIOR VISUAL RELATIONSHIP REASONING FOR VISUAL QUESTION ANSWERING
    Yang, Zhuoqian
    Qin, Zengchang
    Yu, Jing
    Wan, Tao
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 1411 - 1415
  • [42] Combining Natural Logic and Shallow Reasoning for Question Answering
    Angeli, Gabor
    Nayak, Neha
    Manning, Christopher D.
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 442 - 452
  • [43] Analogical Reasoning for Answer Ranking in Social Question Answering
    Tu, Xudong
    Feng, Dan
    Wang, Xin-Jing
    Zhang, Lei
    IEEE INTELLIGENT SYSTEMS, 2012, 27 (05) : 28 - 35
  • [44] PathReasoner: Explainable reasoning paths for commonsense question answering
    Zhan, Xunlin
    Huang, Yinya
    Dong, Xiao
    Cao, Qingxing
    Liang, Xiaodan
    Knowledge-Based Systems, 2022, 235
  • [45] Video Question Answering with Spatio-Temporal Reasoning
    Yunseok Jang
    Yale Song
    Chris Dongjoo Kim
    Youngjae Yu
    Youngjin Kim
    Gunhee Kim
    International Journal of Computer Vision, 2019, 127 : 1385 - 1412
  • [46] SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning
    Mirzaee, Roshanak
    Faghihi, Hossein Rajaby
    Ning, Qiang
    Kordjamshidi, Parisa
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 4582 - 4598
  • [47] Approaching Question Answering by Means of Paragraph Validation
    Rodrigo, Alvaro
    Perez-Iglesias, Joaquin
    Penas, Anselmo
    Garrido, Guillermo
    Araujo, Lourdes
    MULTILINGUAL INFORMATION ACCESS EVALUATION I: TEXT RETRIEVAL EXPERIMENTS, 2010, 6241 : 245 - 252
  • [48] Towards an automatic validation of answers in Question answering
    Ligozat, Anne-Laure
    Grau, Brigitte
    Vilnat, Anne
    Robba, Isabelle
    Grappy, Arnaud
    19TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL II, PROCEEDINGS, 2007, : 444 - 447
  • [49] Evaluating question answering validation as a classification problem
    Álvaro Rodrigo
    Anselmo Peñas
    Felisa Verdejo
    Language Resources and Evaluation, 2012, 46 : 493 - 501
  • [50] Comparing Approaches for Evaluating Question Answering Validation
    Rodrigo, Alvaro
    Penas, Anselmo
    Verdejo, Felisa
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2009, (43) : 277 - 285