Multi-label text classification based on the label correlation mixture model

Cited by: 6
Authors
He, Zhiyang [1 ]
Wu, Ji [1 ]
Lv, Ping [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
[2] Tsinghua iFlytek Joint Lab Speech Technol, Beijing, Peoples R China
Keywords
Label correlation mixture model; probabilistic generative model; multi-label text classification; label correlation model; label correlation network; Bayes decision theory; DESIGN;
DOI
10.3233/IDA-163055
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a probabilistic generative model, the label correlation mixture model (LCMM), to depict multi-labeled document data, which can be utilized for multi-label text classification. LCMM assumes two stochastic generative processes, which correspond to two submodels: 1) a label correlation model; and 2) a label mixture model. The former formulates the generative process of labels, in which a label correlation network is created to depict the dependency between labels. Moreover, we present an efficient inference algorithm for calculating the generative probability of a multi-label class. Furthermore, in order to optimize the label correlation network, we propose a parameter-learning algorithm based on gradient descent. The second submodel of the LCMM depicts the generative process of the words in a document given its labels. Different traditional mixture models, such as mixtures of language models or topic models, can be adopted in this generative process. In the multi-label classification stage, we propose a two-step strategy to utilize the LCMM most efficiently within the framework of Bayes decision theory. We conduct extensive multi-label classification experiments on three standard text data sets. The experimental results show significant performance improvements compared with existing approaches. For example, the improvements in accuracy and macro F-score on the OHSUMED data set reach 28.3% and 37.0%, respectively. These performance enhancements demonstrate the effectiveness of the proposed models and solutions.
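The abstract describes the decision rule only at a high level. For concreteness, below is a minimal Python sketch of a Bayes-decision multi-label scorer in that spirit: a label-set score built from unary and pairwise co-occurrence terms, multiplied by a mixture-of-unigram-language-models document likelihood, maximized by a greedy label-set search. Every concrete choice here (the factorized pairwise label score, the equal-weight unigram mixture, the greedy search, the parameter shapes) is an illustrative assumption, not the paper's actual label correlation network, inference algorithm, or two-step strategy.

import numpy as np

def label_set_log_prior(label_set, log_prior, log_pair):
    """Log score of a label set from unary priors plus pairwise co-occurrence terms (assumed factorization)."""
    labels = sorted(label_set)
    score = sum(log_prior[l] for l in labels)
    score += sum(log_pair[a, b] for i, a in enumerate(labels) for b in labels[i + 1:])
    return score

def doc_log_likelihood(word_counts, label_set, log_word_given_label):
    """Log P(d | Y) under an equal-weight mixture of per-label unigram language models."""
    labels = sorted(label_set)
    # per-word mixture: log sum_l (1/|Y|) P(w | l), computed in log space
    mix = np.logaddexp.reduce(log_word_given_label[labels], axis=0) - np.log(len(labels))
    return float(word_counts @ mix)

def classify(word_counts, n_labels, log_prior, log_pair, log_word_given_label, max_labels=3):
    """Greedy sketch: repeatedly add the label that most improves the joint log score."""
    chosen, best = set(), -np.inf
    for _ in range(max_labels):
        candidates = [chosen | {l} for l in range(n_labels) if l not in chosen]
        if not candidates:
            break
        scored = [(label_set_log_prior(c, log_prior, log_pair)
                   + doc_log_likelihood(word_counts, c, log_word_given_label), c)
                  for c in candidates]
        s, c = max(scored, key=lambda t: t[0])
        if s <= best:
            break
        best, chosen = s, c
    return chosen

# Toy usage with random (hypothetical) parameters:
rng = np.random.default_rng(0)
V, L = 50, 4
log_word_given_label = np.log(rng.dirichlet(np.ones(V), size=L))  # per-label unigram models
log_prior = np.log(rng.dirichlet(np.ones(L)))                     # unary label scores
log_pair = np.log(rng.uniform(0.1, 1.0, size=(L, L)))             # pairwise co-occurrence scores
doc = rng.integers(0, 3, size=V).astype(float)                    # bag-of-words counts
print(classify(doc, L, log_prior, log_pair, log_word_given_label))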
Pages: 1371-1392
Number of pages: 22