A Metamorphic Testing Approach for Assessing Question Answering Systems

Cited by: 6
Authors
Tu, Kaiyi [1]
Jiang, Mingyue [1]
Ding, Zuohua [1]
Affiliations
[1] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou 310018, Peoples R China
Keywords
textual question answering; visual question answering; metamorphic testing; metamorphic relations; quality assessment;
DOI
10.3390/math9070726
Chinese Library Classification number
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
Question Answering (QA) enables the machine to understand and answer questions posed in natural language, which has emerged as a powerful tool in various domains. However, QA is a challenging task and there is an increasing concern about its quality. In this paper, we propose to apply the technique of metamorphic testing (MT) to evaluate QA systems from the users' perspectives, in order to help the users to better understand the capabilities of these systems and then to select appropriate QA systems for their specific needs. Two typical categories of QA systems, namely, the textual QA (TQA) and visual QA (VQA), are studied, and a total number of 17 metamorphic relations (MRs) are identified for them. These MRs respectively focus on some characteristics of different aspects of QA. We further apply MT to four QA systems (including two APIs from the AllenNLP platform, one API from the Transformers platform, and one API from CloudCV) by using all of the MRs. Our experimental results demonstrate the capabilities of the four subject QA systems from various aspects, revealing their strengths and weaknesses. These results further suggest that MT can be an effective method for assessing QA systems.
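The idea of checking a QA system with a metamorphic relation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_qa` and `last_sentence_qa` are hypothetical stand-ins for real TQA systems, and the relation used here (appending an irrelevant sentence to the context should leave the answer unchanged) is an illustrative example rather than one of the 17 MRs from the paper, which the abstract does not list.

```python
from typing import Callable, List, Tuple

# A QA system is modelled as a function (question, context) -> answer.
QASystem = Callable[[str, str], str]
Case = Tuple[str, str]


def _sentences(context: str) -> List[str]:
    """Split a context into non-empty sentences on full stops."""
    return [s.strip() for s in context.split(".") if s.strip()]


def toy_qa(question: str, context: str) -> str:
    """Hypothetical QA system: return the sentence sharing most words with the question."""
    q_words = set(question.lower().strip("?").split())
    return max(
        _sentences(context),
        key=lambda s: len(q_words & set(s.lower().split())),
        default="",
    )


def last_sentence_qa(question: str, context: str) -> str:
    """Hypothetical flawed QA system: always return the last sentence."""
    sentences = _sentences(context)
    return sentences[-1] if sentences else ""


def add_irrelevant_sentence(question: str, context: str) -> Case:
    """Illustrative MR transformation: build the follow-up input from the source input."""
    return question, context + " The weather was mild that year."


def check_mr(qa: QASystem, cases: List[Case],
             transform: Callable[[str, str], Case]) -> List[Tuple[str, str, str]]:
    """Run source and follow-up inputs; report cases where the answers differ."""
    violations = []
    for question, context in cases:
        source_answer = qa(question, context)
        follow_up_answer = qa(*transform(question, context))
        if source_answer != follow_up_answer:  # the MR expects equal answers
            violations.append((question, source_answer, follow_up_answer))
    return violations


CONTEXT = "Hangzhou is located in Zhejiang. Paris is the capital of France."
CASES = [
    ("Where is Hangzhou located?", CONTEXT),
    ("What is the capital of France?", CONTEXT),
]

if __name__ == "__main__":
    for name, system in [("toy_qa", toy_qa), ("last_sentence_qa", last_sentence_qa)]:
        print(name, "violations:", len(check_mr(system, CASES, add_irrelevant_sentence)))
```

No ground-truth answers are needed: the check only compares the system's answer on the source input with its answer on the follow-up input, which is what makes this style of testing usable from a user's perspective.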
Pages: 15
Related papers
50 in total
  • [1] METTLE: A METamorphic Testing Approach to Assessing and Validating Unsupervised Machine Learning Systems
    Xie, Xiaoyuan
    Zhang, Zhiyi
    Chen, Tsong Yueh
    Liu, Yang
    Poon, Pak-Lok
    Xu, Baowen
    IEEE TRANSACTIONS ON RELIABILITY, 2020, 69 (04) : 1293 - 1322
  • [2] A Hybrid Approach for Question Classification in Persian Automatic Question Answering Systems
    Sherkat, Ehsan
    Farhoodi, Mojgan
    2014 4TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2014, : 279 - 284
  • [3] AVA: an Automatic eValuation Approach for Question Answering Systems
    Vu, Thuy
    Moschitti, Alessandro
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5223 - 5233
  • [4] A Mighty Dataset for Stress-Testing Question Answering Systems
    Haarmann, Bastian
    Martens, Claudio
    Petzka, Henning
    Napolitano, Giulio
    2018 IEEE 12TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2018, : 278 - 281
  • [5] Question/Answering Systems
    Visser, Ubbo
    KUNSTLICHE INTELLIGENZ, 2012, 26 (02) : 191 - 195
  • [6] QUESTION ANSWERING SYSTEMS
    Tomljanovic, Jasminka
    Krsnik, Marina
    Pavlic, Mile
    ZBORNIK VELEUCILISTA U RIJECI-JOURNAL OF THE POLYTECHNICS OF RIJEKA, 2014, 2 (01) : 177 - 195
  • [7] An Effective Approach for Relevant Paragraph Retrieval in Question Answering Systems
    Hoque, Md Moinul
    Quaresma, Paulo
    2015 18TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2015, : 44 - 49
  • [8] Testing the reasoning for question answering validation
    Penas, Anselmo
    Rodrigo, Alvaro
    Sama, Valentin
    Verdejo, Felisa
    JOURNAL OF LOGIC AND COMPUTATION, 2008, 18 (03) : 459 - 474
  • [9] A question-entailment approach to question answering
    Ben Abacha, Asma
    Demner-Fushman, Dina
    BMC BIOINFORMATICS, 2019, 20 (01)