MINPROMPT: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering

Cited by: 0
Authors
Chen, Xiusi [1]
Jiang, Jyun-Yu [2]
Chang, Wei-Cheng [2]
Hsieh, Cho-Jui [1]
Yu, Hsiang-Fu [2]
Wang, Wei [1]
Affiliations
[1] Univ Calif Los Angeles, Los Angeles, CA 90024 USA
[2] Amazon Search, Seattle, WA USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent advances in few-shot question answering (QA) mostly rely on the power of pre-trained large language models (LLMs) and fine-tuning in specific settings. Although the pre-training stage has already equipped LLMs with powerful reasoning capabilities, LLMs still need to be fine-tuned to adapt to specific domains to achieve the best results. In this paper, we propose to select the most informative data for fine-tuning, thereby improving the efficiency of the fine-tuning process with comparable or even better accuracy on the open-domain QA task. We present MINPROMPT, a minimal data augmentation framework for open-domain QA based on an approximate graph algorithm and unsupervised question generation. We transform the raw text into a graph structure to build connections between different factual sentences, then apply graph algorithms to identify the minimal set of sentences needed to cover the most information in the raw text. We then generate QA pairs based on the identified sentence subset and train the model on the selected sentences to obtain the final model. Empirical results on several benchmark datasets and theoretical analysis show that MINPROMPT is able to achieve comparable or better results than baselines with a high degree of efficiency, bringing consistent improvements in F1 scores.
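The sentence-selection step described in the abstract can be illustrated with a small sketch. The abstract does not name the graph algorithm or how sentences are linked, so everything here is an assumption made for illustration: sentences are connected when they share a capitalized token (a crude stand-in for entity extraction), and a greedy covering heuristic picks a small subset whose neighborhoods cover every sentence. The function names `shared_terms`, `build_graph`, and `greedy_cover` are hypothetical, and the question-generation and fine-tuning stages are not shown.

```python
import re
from itertools import combinations


def shared_terms(sentence):
    """Assumed stand-in for entity extraction: capitalized tokens."""
    return set(re.findall(r"\b[A-Z][a-zA-Z]+\b", sentence))


def build_graph(sentences):
    """Link two sentences when they share at least one extracted term."""
    terms = [shared_terms(s) for s in sentences]
    graph = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if terms[i] & terms[j]:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def greedy_cover(graph):
    """Greedy approximation: repeatedly pick the sentence that covers
    the most still-uncovered sentences (itself plus its neighbors)."""
    uncovered = set(graph)
    chosen = []
    while uncovered:
        # Ties are broken toward the lowest sentence index.
        best = max(sorted(graph),
                   key=lambda n: len(({n} | graph[n]) & uncovered))
        chosen.append(best)
        uncovered -= {best} | graph[best]
    return chosen


sentences = [
    "Paris is the capital of France.",
    "France borders Spain.",
    "Spain uses the euro.",
    "Tokyo is in Japan.",
]
selected = greedy_cover(build_graph(sentences))
print([sentences[i] for i in selected])
```

In this toy input the second sentence links to both the first and the third, so the heuristic keeps it together with the unconnected fourth sentence. A greedy choice of this kind is not guaranteed to find the smallest possible subset; it only approximates one.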
Pages: 254-266
Page count: 13