Few-Shot Question Answering by Pretraining Span Selection

Cited: 0
Authors
Ram, Ori [1]
Kirstain, Yuval [1]
Berant, Jonathan [1,2]
Globerson, Amir [1]
Levy, Omer [1]
Affiliations
[1] Tel Aviv Univ, Blavatnik Sch Comp Sci, Tel Aviv, Israel
[2] Allen Inst AI, Seattle, WA USA
Source
59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1 | 2021
Funding
European Research Council
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In several question answering benchmarks, pretrained models have reached human parity through fine-tuning on the order of 100,000 annotated questions and answers. We explore the more realistic few-shot setting, where only a few hundred training examples are available, and observe that standard models perform poorly, highlighting the discrepancy between current pretraining objectives and question answering. We propose a new pretraining scheme tailored for question answering: recurring span selection. Given a passage with multiple sets of recurring spans, we mask in each set all recurring spans but one, and ask the model to select the correct span in the passage for each masked span. Masked spans are replaced with a special token, viewed as a question representation, that is later used during fine-tuning to select the answer span. The resulting model obtains surprisingly good results on multiple benchmarks (e.g., 72.7 F1 on SQuAD with only 128 training examples), while maintaining competitive performance in the high-resource setting.
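To make the scheme concrete, below is a minimal Python sketch of the masking step the abstract describes. It is an illustration under simplifying assumptions, not the authors' released implementation: recurring spans are approximated by repeated word n-grams (the paper operates over model tokens with additional filtering rules), and the names QUESTION_TOKEN, find_recurring_spans, and mask_recurring_spans are hypothetical.

```python
from collections import defaultdict

QUESTION_TOKEN = "[QUESTION]"  # stand-in for the special token in the abstract


def find_recurring_spans(tokens, min_len=2, max_len=5):
    """Group every word n-gram that occurs more than once by its (start, end)
    positions. A toy stand-in for the paper's span identification."""
    positions = defaultdict(list)
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            positions[tuple(tokens[i:i + n])].append((i, i + n))
    return {span: locs for span, locs in positions.items() if len(locs) > 1}


def mask_recurring_spans(tokens):
    """For each set of recurring spans, keep one occurrence as the 'answer'
    and replace every other occurrence with a single [QUESTION] token."""
    masked = list(tokens)
    targets = []   # (answer span kept, occurrences that were masked)
    taken = set()  # token positions already claimed by a longer span
    recurring = sorted(find_recurring_spans(tokens).items(),
                       key=lambda kv: -len(kv[0]))  # prefer longer spans
    for span, locs in recurring:
        if any(i in taken for s, e in locs for i in range(s, e)):
            continue  # greedy non-overlap rule: a simplification
        keep, *to_mask = locs  # first occurrence stays as the answer span
        for s, e in locs:
            taken.update(range(s, e))
        for s, e in to_mask:
            masked[s] = QUESTION_TOKEN
            for i in range(s + 1, e):
                masked[i] = None  # collapse the span to one special token
        targets.append((keep, to_mask))
    return [t for t in masked if t is not None], targets


passage = ("roosevelt was elected in 1932 . roosevelt was elected again in "
           "1936 , and roosevelt was elected a third time in 1940 .").split()
masked, targets = mask_recurring_spans(passage)
print(" ".join(masked))
# -> roosevelt was elected in 1932 . [QUESTION] again in 1936 , and
#    [QUESTION] a third time in 1940 .
```

As the abstract notes, the same special token later stands in for the real question during fine-tuning, so the span-selection head learned this way transfers directly to extractive question answering.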
Pages: 3066-3079
Page count: 14
Related Papers
50 items in total
  • [21] SymKGQA: Few-Shot Knowledge Graph Question Answering via Symbolic Program Generation and Execution
    Agarwal, Prerna
    Kumar, Nishant
    Bedathur, Srikanta
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 10119 - 10140
  • [22] Adaptive IMLE for Few-shot Pretraining-free Generative Modelling
    Aghabozorgi, Mehran
    Peng, Shichong
    Li, Ke
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202 : 248 - 264
  • [23] Efficient Few-Shot Classification via Contrastive Pretraining on Web Data
    Li, Z.
    Wang, H.
    Swistek, T.
    Yu, E.
    Wang, H.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2023, 4 (03): 522 - 533
  • [24] Active Instance Selection for Few-Shot Classification
    Shin, Junsup
    Kang, Youngwook
    Jung, Seungjin
    Choi, Jongwon
    IEEE ACCESS, 2022, 10 : 133186 - 133195
  • [25] Pretraining Graph Neural Networks for Few-Shot Analog Circuit Modeling and Design
    Hakhamaneshi, Kourosh
    Nassar, Marcel
    Phielipp, Mariano
    Abbeel, Pieter
    Stojanovic, Vladimir
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (07) : 2163 - 2173
  • [26] Large Language Models for Binary Health-Related Question Answering: A Zero- and Few-Shot Evaluation
    Fernandez-Pichel, Marcos
    Losada, David E.
    Pichel, Juan C.
    COMPUTATIONAL SCIENCE, ICCS 2024, PT IV, 2024, 14835 : 325 - 339
  • [27] Span Selection Pre-training for Question Answering
    Glass, Michael
    Gliozzo, Alfio
    Chakravarti, Rishav
    Ferritto, Anthony
    Pan, Lin
    Bhargav, G. P. Shrivatsa
    Garg, Dinesh
    Sil, Avirup
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 2773 - 2782
  • [28] Few-Shot Few-Shot Learning and the role of Spatial Attention
    Lifchitz, Yann
    Avrithis, Yannis
    Picard, Sylvaine
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 2693 - 2700
  • [29] Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations
    Coope, Sam
    Farghly, Tyler
    Gerz, Daniela
    Vulic, Ivan
    Henderson, Matthew
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 107 - 121
  • [30] Generating Question-Answer Pairs for Few-Shot Learning
    Wang, YuChen
    Li, Li
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT III, 2023, 14256 : 414 - 425