A Lightweight Method to Generate Unanswerable Questions in English

Cited by: 0
Authors
Gautam, Vagrant [1]
Zhang, Miaoran [1]
Klakow, Dietrich [1]
Affiliations
[1] Saarland University, Saarland Informatics Campus, Saarbrücken, Germany
Source
Findings of the Association for Computational Linguistics: EMNLP 2023 | 2023
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
If a question cannot be answered with the available information, robust systems for question answering (QA) should know not to answer. One way to build QA models that do this is with additional training data consisting of unanswerable questions, created either by employing annotators or through automated methods for unanswerable question generation. To show that the model complexity of existing automated approaches is not justified, we examine a simpler data augmentation method for unanswerable question generation in English: performing antonym and entity swaps on answerable questions. Compared to the prior state of the art, data generated with our training-free and lightweight strategy results in better models (+1.6 F1 points on SQuAD 2.0 data with BERT-large), and has higher human-judged relatedness and readability. We quantify the raw benefits of our approach compared to no augmentation across multiple encoder models, using different amounts of generated data, and also on TydiQA-MinSpan data (+9.3 F1 points with BERT-large). Our results establish swaps as a simple but strong baseline for future work.
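The swap idea named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say how antonyms or entities are identified, so the toy lexicon `ANTONYMS`, the function names `antonym_swaps` and `entity_swaps`, and the hand-supplied entity lists are all hypothetical stand-ins. A swapped question is also only a candidate; nothing here checks that it is truly unanswerable from the context.

```python
import re

# Toy antonym lexicon (an assumption for illustration only; a real pipeline
# would draw antonyms from a lexical resource rather than a hand-written dict).
ANTONYMS = {"largest": "smallest", "first": "last", "before": "after"}


def antonym_swaps(question, antonyms=ANTONYMS):
    """Return one candidate question per word that has an antonym in the lexicon."""
    tokens = question.split()
    candidates = []
    for i, token in enumerate(tokens):
        # Split off trailing punctuation so "largest?" still matches "largest".
        match = re.fullmatch(r"(\w+)(\W*)", token)
        if match is None:
            continue
        word, punct = match.groups()
        replacement = antonyms.get(word.lower())
        if replacement is not None:
            swapped = tokens[:i] + [replacement + punct] + tokens[i + 1:]
            candidates.append(" ".join(swapped))
    return candidates


def entity_swaps(question, question_entities, context_entities):
    """Replace each question entity with a different entity taken from the context.

    Both entity lists are supplied by the caller (hypothetical inputs); in
    practice they would come from a named entity recognizer.
    """
    candidates = []
    for entity in question_entities:
        for other in context_entities:
            # Skip the entity itself and entities already present in the question.
            if other != entity and other not in question:
                candidates.append(question.replace(entity, other))
    return candidates


if __name__ == "__main__":
    q = "What is the largest city in France?"
    print(antonym_swaps(q))
    print(entity_swaps(q, ["France"], ["France", "Germany", "Paris"]))
```

Each candidate would then be paired with the original context and labelled as having no answer before being added to the training data; any filtering of candidates that remain answerable is left out of this sketch.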
Pages: 7349-7360
Page count: 12