A weakly supervised textual entailment approach to zero-shot text classification

Citations: 0
Authors
Pamies, Marc [1]
Llop, Joan [1]
Multari, Francesco [2]
Duran-Silva, Nicolau [2]
Parra-Rojas, Cesar [2]
Gonzalez-Agirre, Aitor [1]
Massucci, Francesco Alessandro [2]
Villegas, Marta [1]
Affiliations
[1] Barcelona Supercomputing Center, Barcelona, Spain
[2] SIRIS Academic, Barcelona, Spain
Source
17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) | 2023
Funding
EU Horizon 2020
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Zero-shot text classification is a widely studied task that deals with a lack of annotated data. The most common approach is to reformulate it as a textual entailment problem, enabling classification into unseen classes. This work explores an effective approach that trains on a weakly supervised dataset generated from traditional classification data. We empirically study the relation between the performance of the entailment task, which is used as a proxy, and the target zero-shot text classification task. Our findings reveal that there is no linear correlation between the two tasks, to the extent that lengthening the fine-tuning process can be detrimental even while the model is still learning, and we propose a straightforward method to stop training at the right time. As a proof of concept, we introduce a domain-specific zero-shot text classifier that was trained on Microsoft Academic Graph data. The model, called SCIroShot, achieves state-of-the-art performance in the scientific domain and competitive results in other areas. Both the model and the evaluation benchmark are publicly available on HuggingFace and GitHub.
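As a usage illustration of the entailment reformulation described in the abstract, the sketch below queries a Hugging Face zero-shot-classification pipeline, which scores each candidate label by turning it into an entailment hypothesis against the input text; because the labels never need to appear in training data, unseen classes can be handled. The model identifier, example text, candidate labels, and hypothesis template are illustrative assumptions, not values prescribed by the paper.

```python
# Minimal sketch of entailment-based zero-shot text classification with the
# Hugging Face `transformers` zero-shot-classification pipeline.
# The model identifier below is an assumption (the abstract mentions a released
# SCIroShot checkpoint but does not give its exact name here); any NLI-style
# model fine-tuned for entailment could be substituted.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="BSC-LT/sciroshot",  # assumed identifier; replace with the released model
)

text = (
    "We present a transformer-based method for predicting protein structure "
    "directly from amino-acid sequences."
)
candidate_labels = ["biology", "computer science", "economics", "physics"]

# Each label is internally converted into a hypothesis such as
# "This example is biology." and scored for entailment against the text.
result = classifier(
    text,
    candidate_labels=candidate_labels,
    hypothesis_template="This example is {}.",
)
print(result["labels"][0], round(result["scores"][0], 3))
```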
Pages: 286-296
Page count: 11