Testing the reasoning for question answering validation

Cited by: 11
Authors
Penas, Anselmo [1 ]
Rodrigo, Alvaro [1 ]
Sama, Valentin [1 ]
Verdejo, Felisa [1 ]
Affiliations
[1] Univ Nacl Educ Distancia, Depto Lenguajes & Sistemas Informat, Madrid 28040, Spain
Keywords
textual entailment; test collections; question answering; answer validation; evaluation;
DOI
10.1093/logcom/exm072
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Question answering (QA) is a task that deserves more collaboration between the natural language processing (NLP) and knowledge representation (KR) communities, not only to introduce reasoning when looking for answers or making use of answer type taxonomies and encyclopaedic knowledge, but also, as discussed here, for answer validation (AV), that is, deciding whether the responses of a QA system are correct or not. This was one of the motivations for the first Answer Validation Exercise at CLEF 2006 (AVE 2006). The starting point for AVE 2006 was the reformulation of answer validation as a recognizing textual entailment (RTE) problem, under the assumption that a hypothesis can be automatically generated by instantiating a hypothesis pattern with a QA system answer. The test collections that we developed in seven different languages at AVE 2006 are specifically oriented to the development and evaluation of answer validation systems. We show in this article the methodology followed for developing these collections, taking advantage of the human assessments already made in the evaluation of QA systems. We also propose an evaluation framework for AV linked to a QA evaluation track. We quantify and discuss the source of errors introduced by the reformulation of the answer validation problem in terms of textual entailment (around 2%, in the range of inter-annotator disagreement). We also show the evaluation results of the first answer validation exercise at CLEF 2006, where 11 groups participated with 38 runs in seven different languages. The most extensively used techniques were Machine Learning and overlapping measures, but systems with broader knowledge resources and richer representation formalisms obtained the best results.
Pages: 459-474
Page count: 16
Related papers
50 in total
  • [21] Operation-Augmented Numerical Reasoning for Question Answering
    Zhou, Yongwei
    Bao, Junwei
    Wu, Youzheng
    He, Xiaodong
    Zhao, Tiejun
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 15 - 28
  • [22] Medical Visual Question Answering via Conditional Reasoning
    Zhan, Li-Ming
    Liu, Bo
    Fan, Lu
    Chen, Jiaxin
    Wu, Xiao-Ming
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 2345 - 2354
  • [23] Neural Reasoning, Fast and Slow, for Video Question Answering
    Thao Minh Le
    Vuong Le
    Venkatesh, Svetha
    Truyen Tran
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [24] Interpretable Visual Question Answering by Reasoning on Dependency Trees
    Cao, Qingxing
    Liang, Xiaodan
    Li, Bailin
    Lin, Liang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (03) : 887 - 901
  • [25] Video Question Answering with Spatio-Temporal Reasoning
    Jang, Yunseok
    Song, Yale
    Kim, Chris Dongjoo
    Yu, Youngjae
    Kim, Youngjin
    Kim, Gunhee
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2019, 127 (10) : 1385 - 1412
  • [26] PathReasoner: Explainable reasoning paths for commonsense question answering
    Zhan, Xunlin
    Huang, Yinya
    Dong, Xiao
    Cao, Qingxing
    Liang, Xiaodan
    KNOWLEDGE-BASED SYSTEMS, 2022, 235
  • [27] Instance-sequence reasoning for video question answering
    LIU Rui
    HAN Yahong
    Frontiers of Computer Science, 2022, 16 (06)
  • [28] Multimodal Knowledge Reasoning for Enhanced Visual Question Answering
    Hussain, Afzaal
    Maqsood, Ifrah
    Shahzad, Muhammad
    Fraz, Muhammad Moazam
    2022 16TH INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY & INTERNET-BASED SYSTEMS, SITIS, 2022, : 224 - 230
  • [30] Reasoning with large language models for medical question answering
    Lucas, Mary M.
    Yang, Justin
    Pomeroy, Jon K.
    Yang, Christopher C.
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2024, 31 (09)