Testing the reasoning for question answering validation

Cited by: 11
Authors
Penas, Anselmo [1]
Rodrigo, Alvaro [1]
Sama, Valentin [1]
Verdejo, Felisa [1]
Affiliation
[1] Univ Nacl Educ Distancia, Depto Lenguajes & Sistemas Informat, Madrid 28040, Spain
Keywords
textual entailment; test collections; question answering; answer validation; evaluation
DOI
10.1093/logcom/exm072
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Question answering (QA) is a task that deserves more collaboration between the natural language processing (NLP) and knowledge representation (KR) communities, not only to introduce reasoning when looking for answers or making use of answer type taxonomies and encyclopaedic knowledge, but also, as discussed here, for answer validation (AV), that is, deciding whether the responses of a QA system are correct. This was one of the motivations for the first Answer Validation Exercise at CLEF 2006 (AVE 2006). The starting point for AVE 2006 was the reformulation of answer validation as a recognizing textual entailment (RTE) problem, under the assumption that a hypothesis can be automatically generated by instantiating a hypothesis pattern with a QA system answer. The test collections that we developed in seven different languages at AVE 2006 are specifically oriented to the development and evaluation of answer validation systems. We show in this article the methodology followed for developing these collections, taking advantage of the human assessments already made in the evaluation of QA systems. We also propose an evaluation framework for AV linked to a QA evaluation track. We quantify and discuss the source of errors introduced by the reformulation of the answer validation problem in terms of textual entailment (around 2%, in the range of inter-annotator disagreement). We also show the evaluation results of the first answer validation exercise at CLEF 2006, in which 11 groups participated with 38 runs in seven different languages. The most extensively used techniques were machine learning and overlapping measures, but systems with broader knowledge resources and richer representation formalisms obtained the best results.
Pages: 459-474
Page count: 16