Zero-Shot Text Classification with Semantically Extended Textual Entailment

被引:0
作者
Liu, Tengfei [1 ]
Hu, Yongli [1 ]
Chen, Puman [1 ]
Sun, Yanfeng [1 ]
Yin, Baocai [1 ]
机构
[1] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Fac Informat Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
来源
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023年
基金
中国国家自然科学基金; 国家重点研发计划;
关键词
Zero-shot text classification; semantic extension; text entailment;
D O I
10.1109/IJCNN54540.2023.10191094
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Zero-shot text classification (0SHOT-TC) aims to detect classes that the model never seen in the training set, and has attracted much attention in the research community of Natural Language Processing (NLP). The emergence of pretrained language models has fostered the progress of 0SHOT-TC, which turns the task into a textual entailment problem of binary classification. It learns an entailment relatedness (yes/no) between the given sentence (premise) and each category (hypothesis) separately. However, the hypothesis generation paradigms need to be further studied, since the label itself or the label descriptions have limited ability to fully express the category space. Conversely, humans can easily extend a set of words describing the categories to be classified. In this paper, we propose a novel zero-shot text classification method called Semantically Extended Textual Entailment (SETE), which imitates the human's ability in knowledge extension. In the proposed method, three semantic extension methods are used to enrich the categories through a combination of static knowledge (e.g. expert knowledge, knowledge graph) and dynamic knowledge (e.g. language models), and the textual entailment model is finally used for 0SHOT-TC. The experimental results on the benchmarks show that our approach significantly outperforms the current methods in both generalized and nongeneralized 0SHOT-TC.
引用
收藏
页数:8
相关论文
empty
未找到相关数据