Effective Collaborative Representation Learning for Multilabel Text Categorization

Cited by: 17
Authors
Wu, Hao [1]
Qin, Shaowei [1]
Nie, Rencan [1, 2]
Cao, Jinde [3, 4]
Gorbachev, Sergey [5]
Affiliations
[1] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650091, Yunnan, Peoples R China
[2] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
[3] Southeast Univ, Sch Math, Nanjing 210096, Peoples R China
[4] Yonsei Univ, Yonsei Frontier Lab, Seoul 03722, South Korea
[5] Natl Res Tomsk State Univ, Dept Innovat Technol, Tomsk 634050, Russia
Funding
National Natural Science Foundation of China
Keywords
Text categorization; Training; Predictive models; Electronic mail; Data models; Collaboration; Biological system modeling; Collaborative representation learning (CRL); matrix factorization; multitask learning (MTL); neural networks; text categorization;
DOI
10.1109/TNNLS.2021.3069647
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rise of deep learning, considerable attention has been paid to developing neural models for multilabel text categorization (MLTC). Most existing work concentrates on modeling the word-label relationship, while less attention has been paid to exploiting global clues, particularly the document-label relationship. To address this limitation, we propose an effective collaborative representation learning (CRL) model in this article. CRL consists of a factorization component for generating shallow representations of documents and a neural component for deep text encoding and classification. We have developed strategies for jointly training these two components, including an alternating-least-squares-based approach for factorizing the pointwise mutual information (PMI) matrix of label-document and a multitask learning (MTL) strategy for the neural component. According to experimental results on six data sets, CRL can explicitly take advantage of the document-label relationship and achieves classification performance competitive with some state-of-the-art deep methods.
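The factorization step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names ppmi_matrix and als_factorize, the clipping of negative PMI values to zero, the ridge regularizer reg, the rank, and the toy label-document matrix are all assumptions made here for illustration.

```python
import numpy as np


def ppmi_matrix(Y):
    """Positive PMI between labels (rows) and documents (columns).

    Y is a binary label-document matrix of shape (n_labels, n_docs).
    Clipping negative PMI to zero is an assumption of this sketch.
    """
    Y = np.asarray(Y, dtype=float)
    p_ld = Y / Y.sum()                      # joint probabilities
    p_l = p_ld.sum(axis=1, keepdims=True)   # label marginals
    p_d = p_ld.sum(axis=0, keepdims=True)   # document marginals
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_ld / (p_l * p_d))
    pmi[~np.isfinite(pmi)] = 0.0            # unseen pairs carry no signal
    return np.maximum(pmi, 0.0)


def als_factorize(M, rank=2, reg=0.1, n_iters=50, seed=0):
    """Approximate M by L @ D.T using ridge-regularized alternating least squares."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = M.shape
    L = rng.normal(scale=0.1, size=(n_rows, rank))   # label factors
    D = rng.normal(scale=0.1, size=(n_cols, rank))   # document factors
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Fix D and solve the least-squares problem for L in closed form.
        L = np.linalg.solve(D.T @ D + ridge, D.T @ M.T).T
        # Fix L and solve the least-squares problem for D in closed form.
        D = np.linalg.solve(L.T @ L + ridge, L.T @ M).T
    return L, D


# Toy data (hypothetical): 3 labels, 4 documents.
Y = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 1]])
M = ppmi_matrix(Y)
L, D = als_factorize(M, rank=2)
```

In this sketch each row of D plays the role of a shallow document representation; how such vectors are passed to the neural component and trained jointly with it is described in the paper itself and is not reproduced here.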
Pages: 5200-5214
Number of pages: 15