Cloze-Style Data Augmentation for Few-Shot Intent Recognition

Cited by: 1
Authors
Zhang, Xin [1 ]
Jiang, Miao [1 ]
Chen, Honghui [1 ]
Chen, Chonghao [1 ]
Zheng, Jianming [1 ]
Affiliations
[1] Natl Univ Def & Technol, Sci & Technol Informat Syst Engn Lab, 109 Deya St, Changsha 410073, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
few-shot learning; intent recognition; data augmentation; pretrained language model;
DOI
10.3390/math10183358
Chinese Library Classification
O1 [Mathematics];
Subject Classification Codes
0701; 070101;
Abstract
Intent recognition aims to identify users' potential intents from their utterances and is a key component of task-oriented dialog systems. A real challenge, however, is that the number of intent categories has grown faster than human-annotated data, leaving only a small amount of labeled data available for many new intent categories. This data scarcity causes traditional deep neural networks to overfit the limited training data, which seriously affects practical applications. Hence, researchers have proposed few-shot learning to address the data-scarcity issue. One efficient approach is text augmentation, but it often generates noisy or meaningless data. To address these issues, we propose leveraging the knowledge in pre-trained language models and construct a cloze-style data augmentation (CDA) model. We employ unsupervised learning to force the augmented data to be semantically similar to the initial input sentences, and contrastive learning to enhance the uniqueness of each category. Experimental results on the CLINC-150 and BANKING-77 datasets show that our proposal outperforms competitive baselines, demonstrating its effectiveness. In addition, an ablation study verifying the function of each module shows that the contrastive learning module plays the most important role in improving recognition accuracy.
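To make the two ingredients described in the abstract concrete, the sketch below illustrates (a) cloze-style augmentation, where tokens of an utterance are masked and a pretrained masked language model fills the blanks to produce semantically close variants, and (b) a supervised contrastive loss that pulls utterances of the same intent together and pushes different intents apart. This is a minimal illustration under assumed choices (BERT via the Hugging Face fill-mask pipeline, a 15% masking rate, a temperature of 0.1), not the authors' exact CDA implementation.

```python
# Minimal sketch (not the paper's exact CDA model): cloze-style augmentation
# with a pretrained masked LM, plus a supervised contrastive loss over
# sentence embeddings. Model name, masking rate, and temperature are
# illustrative assumptions.
import random

import torch
import torch.nn.functional as F
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
MASK = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT


def cloze_augment(utterance: str, mask_rate: float = 0.15, top_k: int = 3):
    """Mask a fraction of tokens and let the masked LM fill each blank,
    yielding paraphrase-like variants of a low-resource intent utterance."""
    tokens = utterance.split()
    variants = []
    for i, token in enumerate(tokens):
        if random.random() > mask_rate:
            continue
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        for cand in fill_mask(" ".join(masked), top_k=top_k):
            if cand["token_str"].strip().lower() != token.lower():
                variants.append(cand["sequence"])  # full filled-in sentence
    return variants


def supervised_contrastive_loss(embeddings: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """Pull original/augmented utterances of the same intent together and
    push different intents apart, sharpening each category's uniqueness."""
    z = F.normalize(embeddings, dim=1)               # (N, d) unit vectors
    sim = z @ z.t() / temperature                    # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))  # exclude self-pairs
    positives = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    per_anchor = (-log_prob.masked_fill(~positives, 0.0).sum(dim=1)
                  / positives.sum(dim=1).clamp(min=1))
    return per_anchor[positives.any(dim=1)].mean()   # anchors with a positive


print(cloze_augment("how do i transfer money to my savings account"))
```

In training, the original few-shot utterances and their cloze-generated variants would be encoded by the same model, and the contrastive term combined with the classification objective; per the abstract's ablation, the contrastive component contributes most to the accuracy gain.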
Pages: 13
Related Papers
50 records in total
  • [1] Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension
    Lovenia, Holy
    Wilie, Bryan
    Chung, Willy
    Zeng, Min
    Cahyawijaya, Samuel
    Dan, Su
    Fung, Pascale
    PROCEEDINGS OF THE 7TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP, 2022, : 60 - 66
  • [2] Few-Shot Intent Detection by Data Augmentation and Class Knowledge Transfer
    Guo, Zhijun
    Niu, Kun
    Chen, Xiao
    Liu, Qi
    Li, Xiao
    2024 6TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING, ICNLP 2024, 2024, : 458 - 462
  • [3] Gotta: Generative Few-shot Question Answering by Prompt-based Cloze Data Augmentation
    Chen, Xiusi
    Zhang, Yu
    Deng, Jinliang
    Jiang, Jyun-Yu
    Wang, Wei
    PROCEEDINGS OF THE 2023 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2023, : 909 - 917
  • [4] Self-Supervised Task Augmentation for Few-Shot Intent Detection
    Peng-Fei Sun
    Ya-Wen Ouyang
    Ding-Jie Song
    Xin-Yu Dai
    Journal of Computer Science and Technology, 2022, 37 : 527 - 538
  • [5] Self-Supervised Task Augmentation for Few-Shot Intent Detection
    Sun, Peng-Fei
    Ouyang, Ya-Wen
    Song, Ding-Jie
    Dai, Xin-Yu
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2022, 37 (03) : 527 - 538
  • [6] Few-shot dysarthric speech recognition with text-to-speech data augmentation
    Hermann, Enno
    Magimai-Doss, Mathew
    INTERSPEECH 2023, 2023, : 156 - 160
  • [7] Data Augmentation with Nearest Neighbor Classifier for Few-Shot Named Entity Recognition
    Ge, Yao
    Al-Garadi, Mohammed Ali
    Sarker, Abeed
    MEDINFO 2023 - THE FUTURE IS ACCESSIBLE, 2024, 310 : 690 - 694
  • [8] Few-Shot Charge Prediction with Data Augmentation and Feature Augmentation
    Wang, Peipeng
    Zhang, Xiuguo
    Cao, Zhiying
    APPLIED SCIENCES-BASEL, 2021, 11 (22):
  • [9] Few-shot imbalanced classification based on data augmentation
    Chao, Xuewei
    Zhang, Lixin
    MULTIMEDIA SYSTEMS, 2023, 29 (05) : 2843 - 2851
  • [10] Few-shot learning through contextual data augmentation
    Arthaud, Farid
    Bawden, Rachel
    Birch, Alexandra
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1049 - 1062