Joint Representations of Texts and Labels with Compositional Loss for Short Text Classification

Cited by: 3

Authors:

Hao, Ming ^{[1]}

Wang, Weijing ^{[2]}

Zhou, Fang ^{[1]}

Affiliations:

[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China

[2] Univ Illinois, Dept Bioengn, Urbana, IL 61801 USA

Source:

JOURNAL OF WEB ENGINEERING | 2021 / Vol. 20 / No. 03

Funding:

National Key Research and Development Program of China; National Natural Science Foundation of China;

Keywords:

Ambiguous text; deep language models; label embedding; text classification; triplet loss;

DOI:

10.13052/jwe1540-9589.2035

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Short text classification is an important foundation for natural language processing (NLP) tasks. Although text classification based on deep language models (DLMs) has made significant headway, in practical applications some texts are ambiguous and hard to classify, especially in multi-class classification of short texts, whose context length is limited. Mainstream methods improve the distinction of ambiguous text by adding context information. However, these methods rely only on the text representation and ignore the fact that categories overlap and are not completely independent of each other. In this paper, we establish a new general method for ambiguous text classification by introducing a label embedding to represent each category, which makes the differences between categories measurable. Further, we propose a new compositional loss function to train the model, which pulls the text representation closer to the ground-truth label and pushes it farther away from the others. Finally, a constraint is obtained by calculating the similarity between the text representation and the label embeddings. Errors caused by ambiguous text can be corrected by adding this constraint to the output layer of the model. We apply the method to three classical models and conduct experiments on six public datasets. The experiments show that our method effectively improves the classification accuracy of ambiguous texts. In addition, by combining our method with BERT, we obtain state-of-the-art results on the CNT dataset.
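The abstract does not give the exact form of the compositional loss or of the output-layer constraint, so the following is only a minimal sketch of one plausible reading: a cross-entropy term combined with a triplet-style margin term over text-to-label cosine similarities, plus a similarity score added to the logits at prediction time. The function names, the `margin`, `alpha`, and `beta` parameters, and the additive form of the constraint are all assumptions made for illustration, not the authors' formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [x / total for x in exps]

def compositional_loss(logits, text_vec, label_embs, target,
                       margin=0.2, alpha=1.0):
    """Assumed form: cross-entropy + alpha * triplet-style margin term.

    The margin term penalizes any non-target label whose similarity to
    the text comes within `margin` of the ground-truth label's similarity.
    """
    ce = -math.log(softmax(logits)[target])
    sims = [cosine(text_vec, e) for e in label_embs]
    triplet = sum(max(0.0, margin - sims[target] + s)
                  for k, s in enumerate(sims) if k != target)
    return ce + alpha * triplet

def constrained_prediction(logits, text_vec, label_embs, beta=1.0):
    """Assumed constraint: add scaled text-label similarity to the logits."""
    sims = [cosine(text_vec, e) for e in label_embs]
    scores = [z + beta * s for z, s in zip(logits, sims)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy example: three classes with hypothetical one-hot label embeddings.
label_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
logits = [1.0, 1.2, 0.0]      # the classifier alone slightly prefers class 1
text_vec = [1.0, 0.1, 0.0]    # the text representation lies near label 0
pred = constrained_prediction(logits, text_vec, label_embs)
loss = compositional_loss(logits, text_vec, label_embs, target=0)
```

In this toy case the raw logits favor class 1, while the similarity term shifts the prediction to class 0, which illustrates how a label-embedding constraint could correct an ambiguous case under these assumptions.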


Pages: 669-687

Number of pages: 19
