Few-Shot Question Answering by Pretraining Span Selection

Cited by: 0
Authors
Ram, Ori [1 ]
Kirstain, Yuval [1 ]
Berant, Jonathan [1 ,2 ]
Globerson, Amir [1 ]
Levy, Omer [1 ]
Affiliations
[1] Tel Aviv Univ, Blavatnik Sch Comp Sci, Tel Aviv, Israel
[2] Allen Inst AI, Seattle, WA USA
Source
59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1 | 2021
Funding
European Research Council;
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In several question answering benchmarks, pretrained models have reached human parity through fine-tuning on an order of 100,000 annotated questions and answers. We explore the more realistic few-shot setting, where only a few hundred training examples are available, and observe that standard models perform poorly, highlighting the discrepancy between current pretraining objectives and question answering. We propose a new pretraining scheme tailored for question answering: recurring span selection. Given a passage with multiple sets of recurring spans, we mask in each set all recurring spans but one, and ask the model to select the correct span in the passage for each masked span. Masked spans are replaced with a special token, viewed as a question representation, that is later used during fine-tuning to select the answer span. The resulting model obtains surprisingly good results on multiple benchmarks (e.g., 72.7 F1 on SQuAD with only 128 training examples), while maintaining competitive performance in the high-resource setting.
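As a rough illustration of the recurring span selection objective described in the abstract, the sketch below works over plain word tokens: it finds n-gram spans that occur more than once in a passage, then masks every occurrence of a chosen set except one, which serves as the gold answer span. This is a minimal sketch, not the authors' implementation; the `[QUESTION]` token name, the n-gram length limits, and the choice to keep the first occurrence are assumptions for illustration.

```python
from collections import defaultdict

QUESTION = "[QUESTION]"  # stand-in for the special mask token


def recurring_spans(tokens, min_len=1, max_len=3):
    """Group the positions of every n-gram that occurs more than once."""
    occurrences = defaultdict(list)
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            occurrences[tuple(tokens[i:i + n])].append((i, i + n))
    return {span: pos for span, pos in occurrences.items() if len(pos) > 1}


def mask_all_but_one(tokens, positions):
    """Keep the first occurrence as the gold answer span; replace each
    remaining occurrence with a single QUESTION token."""
    answer = positions[0]
    to_mask = set(positions[1:])
    out, i = [], 0
    while i < len(tokens):
        # If a masked occurrence starts here, emit one QUESTION token
        # and skip past the whole span.
        end = next((e for (s, e) in to_mask if s == i), None)
        if end is not None:
            out.append(QUESTION)
            i = end
        else:
            out.append(tokens[i])
            i += 1
    return out, answer
```

For example, in "the cat sat on the mat because the cat was tired", the span "the cat" recurs; masking all but its first occurrence yields "... because [QUESTION] was tired", and the model's target is the surviving "the cat" at positions (0, 2). The real objective operates on subword tokens and applies this to many span sets per passage at once.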
Pages: 3066-3079
Page count: 14