Multi-label Text Classification Method Based on Label Semantic Information

Cited by: 0
Authors
Xiao L. [1 ]
Chen B.-L. [1 ]
Huang X. [1 ]
Liu H.-F. [1 ]
Jing L.-P. [1 ]
Yu J. [1 ]
Affiliations
[1] Beijing Key Laboratory of Traffic Data Analysis and Mining (Beijing Jiaotong University), Beijing
Source
Journal of Software (Ruan Jian Xue Bao) | 2020, Vol. 31, No. 4
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Attention mechanism; Label semantic; Multi-label; Text classification;
DOI
10.13328/j.cnki.jos.005923
Abstract
Multi-label classification has become a practical and important problem with the boom of big data, with applications in text classification, image recognition, video annotation, multimedia information retrieval, and more. Traditional multi-label text classification algorithms treat labels as symbols without inherent semantics. In many scenarios, however, labels carry specific semantics, and this semantic information corresponds to the content of the documents. To establish and exploit this connection, a label semantic attention multi-label classification (LASA) method is proposed. The method lets the document text and the labels share a common word representation space. For document encoding, a bi-directional long short-term memory network (Bi-LSTM) is used to obtain a hidden representation of each word, and the semantic representation of each label is used to weight the words, thereby capturing the importance of each word to the current label. In addition, since labels are often related to one another in the semantic space, the label semantic information is also used to model label correlations and improve classification performance. Experimental results on standard multi-label classification datasets show that the proposed method effectively captures important words and outperforms existing state-of-the-art multi-label classification algorithms. © Copyright 2020, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
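The label-semantic attention step described in the abstract can be sketched as follows. This is a minimal illustration under assumed shapes: the function name `label_attention`, the dot-product scoring, and the sigmoid output layer are assumptions for exposition, not the paper's exact architecture (which encodes words with a Bi-LSTM over shared word/label embeddings).

```python
import numpy as np

def label_attention(H, C):
    """Weight each word by its relevance to each label (sketch).

    H: (n_words, d) hidden word representations (e.g. from a Bi-LSTM).
    C: (n_labels, d) label embeddings in the same semantic space.
    Returns per-label document representations (n_labels, d)
    and attention weights (n_labels, n_words).
    """
    scores = C @ H.T                              # relevance of each word to each label
    scores -= scores.max(axis=1, keepdims=True)   # stabilize the softmax
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)             # softmax over words: attention weights
    D = A @ H                                     # label-specific document representations
    return D, A

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))                       # 6 words, hidden size 4
C = rng.normal(size=(3, 4))                       # 3 labels
D, A = label_attention(H, C)
# Illustrative per-label score: sigmoid of the dot product with the label embedding.
probs = 1.0 / (1.0 + np.exp(-(D * C).sum(axis=1)))
```

Because each label attends to the words separately, a word can dominate the representation for one label while being ignored for another, which is the "importance of each word to the current label" behavior the abstract describes.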
Pages: 1079-1089 (10 pages)