CPCL: Conceptual prototypical contrastive learning for Few-Shot text classification

Cited by: 1
Authors
Cheng, Tao [1 ]
Cheng, Hua [1 ]
Fang, Yiquan [1 ]
Liu, Yufei [1 ]
Gao, Caiting [1 ]
Affiliations
[1] East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
Keywords
Prototypical network; text classification; Few-Shot learning; prompt learning;
DOI
10.3233/JIFS-231570
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As a prototype-based Few-Shot Learning method, the Prototypical Network generates a prototype for each class under low-resource conditions and classifies through a metric module. The quality of the prototypes therefore matters, yet prototypes estimated from only a few support instances are inaccurate, and the domain-specific information in the training data harms their generalizability. We propose a Conceptual Prototype (CP), which contains both rich instance features and concept features. The numerous query instances can enrich the few support instances: an interactive network is designed to leverage the interrelation between the support set and the query-detached set to acquire a rich Instance Prototype that is representative of the whole data. In addition, class labels are introduced into the prototype through prompt engineering, which makes it more conceptual. The label-only concept makes the prototype immune to domain-specific information during the training phase, improving its generalizability. Based on CP, Conceptual Prototypical Contrastive Learning (CPCL) is proposed, in which prototypical contrastive learning pulls each instance closer to its corresponding prototype and pushes it away from the other prototypes. In "2-way 5-shot" experiments, CPCL achieves 92.41% accuracy on the ARSC dataset, 2.30% higher than other prototype-based models. Meanwhile, the 0-shot performance of CPCL is comparable to the 5-shot performance of the Induction Network, indicating that our model is adequate for 0-shot tasks.
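The prototypical contrastive objective summarized above (pulling each instance toward its class prototype and pushing it away from the other prototypes) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class-mean definition of the prototype, the cosine-similarity metric, the embedding size, and the temperature hyperparameter are all assumptions.

import torch
import torch.nn.functional as F

def class_prototypes(support_emb: torch.Tensor,
                     support_labels: torch.Tensor,
                     n_classes: int) -> torch.Tensor:
    """Mean of support embeddings per class (assumed prototype definition)."""
    return torch.stack([support_emb[support_labels == c].mean(dim=0)
                        for c in range(n_classes)])

def prototypical_contrastive_loss(query_emb: torch.Tensor,
                                  query_labels: torch.Tensor,
                                  prototypes: torch.Tensor,
                                  temperature: float = 0.1) -> torch.Tensor:
    """Pull each query toward its own prototype, push it from the others.

    Implemented as cross-entropy over cosine similarities to all prototypes;
    the temperature value is a hypothetical hyperparameter.
    """
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    logits = q @ p.t() / temperature          # [n_query, n_classes]
    return F.cross_entropy(logits, query_labels)

# Toy "2-way 5-shot" episode; random tensors stand in for encoder outputs.
if __name__ == "__main__":
    torch.manual_seed(0)
    support = torch.randn(10, 768)            # 2 classes x 5 shots
    support_y = torch.tensor([0] * 5 + [1] * 5)
    query = torch.randn(6, 768)
    query_y = torch.tensor([0, 0, 0, 1, 1, 1])

    protos = class_prototypes(support, support_y, n_classes=2)
    loss = prototypical_contrastive_loss(query, query_y, protos)
    print(f"episode loss: {loss.item():.4f}")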
Pages: 11963-11975
Number of pages: 13