ML-FOREST: A Multi-Label Tree Ensemble Method for Multi-Label Classification

Cited by: 72
Authors
Wu, Qingyao [1]
Tan, Mingkui [1]
Song, Hengjie [1]
Chen, Jian [1]
Ng, Michael K. [2]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou 510641, Guangdong, Peoples R China
[2] Hong Kong Baptist Univ, Dept Math, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-label classification; label dependency; label transfer; tree classifier; ensemble methods;
DOI
10.1109/TKDE.2016.2581161
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-label classification deals with the problem where each example is associated with multiple class labels. Since labels often depend on one another, exploiting label dependencies can significantly improve multi-label classification performance. In existing studies, the label dependency is often given as prior knowledge or learned from the labels only. However, in many real applications, such prior knowledge may not be available, or labeled information may be very limited. In this paper, we propose a new algorithm, called ML-FOREST, which learns an ensemble of hierarchical multi-label classifier trees to reveal the intrinsic label dependencies. In ML-FOREST, we construct a set of hierarchical trees and develop a label transfer mechanism to identify the multiple relevant labels in a hierarchical way. In general, the relevant labels at higher levels of the trees capture more discriminable label concepts, and they are transferred to lower-level child nodes that are harder to discriminate. The relevant labels in the hierarchy are then aggregated to compute label dependency and make the final prediction. Our empirical study shows encouraging results for the proposed algorithm in comparison with state-of-the-art multi-label classification algorithms under the Friedman test and the post-hoc Nemenyi test.
Pages: 2665-2680
Page count: 16