A weakly supervised textual entailment approach to zero-shot text classification

Citations: 0
Authors
Pamies, Marc [1]
Llop, Joan [1]
Multari, Francesco [2]
Duran-Silva, Nicolau [2]
Parra-Rojas, Cesar [2]
Gonzalez-Agirre, Aitor [1]
Massucci, Francesco Alessandro [2]
Villegas, Marta [1]
Affiliations
[1] Barcelona Supercomputing Center, Barcelona, Spain
[2] SIRIS Academic, Barcelona, Spain
Source
17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) | 2023
Funding
EU Horizon 2020
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Zero-shot text classification is a widely studied task that deals with a lack of annotated data. The most common approach is to reformulate it as a textual entailment problem, enabling classification into unseen classes. This work explores an effective approach that trains on a weakly supervised dataset generated from traditional classification data. We empirically study the relation between the performance of the entailment task, which is used as a proxy, and the target zero-shot text classification task. Our findings reveal that there is no linear correlation between the two tasks, to the extent that lengthening the fine-tuning process can be detrimental even while the model is still learning, and we propose a straightforward method to stop training at the right time. As a proof of concept, we introduce a domain-specific zero-shot text classifier that was trained on Microsoft Academic Graph data. The model, called SCIroShot, achieves state-of-the-art performance in the scientific domain and competitive results in other areas. Both the model and the evaluation benchmark are publicly available on HuggingFace and GitHub.
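As a usage illustration of the entailment reformulation described in the abstract, the sketch below queries a Hugging Face zero-shot-classification pipeline, which scores each candidate label by turning it into an entailment hypothesis against the input text; because the labels never need to appear in training data, unseen classes can be handled. The model identifier, example text, candidate labels, and hypothesis template are illustrative assumptions, not values prescribed by the paper.

```python
# Minimal sketch of entailment-based zero-shot text classification with the
# Hugging Face `transformers` zero-shot-classification pipeline.
# The model identifier below is an assumption (the abstract mentions a released
# SCIroShot checkpoint but does not give its exact name here); any NLI-style
# model fine-tuned for entailment could be substituted.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="BSC-LT/sciroshot",  # assumed identifier; replace with the released model
)

text = (
    "We present a transformer-based method for predicting protein structure "
    "directly from amino-acid sequences."
)
candidate_labels = ["biology", "computer science", "economics", "physics"]

# Each label is internally converted into a hypothesis such as
# "This example is biology." and scored for entailment against the text.
result = classifier(
    text,
    candidate_labels=candidate_labels,
    hypothesis_template="This example is {}.",
)
print(result["labels"][0], round(result["scores"][0], 3))
```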
Pages: 286-296
Page count: 11