Exploiting label dependencies for improved sample complexity

Cited by: 17
Authors
Chekina, Lena [1,2]
Gutfreund, Dan [3]
Kontorovich, Aryeh [4]
Rokach, Lior [1,2]
Shapira, Bracha [1,2]
Affiliations
[1] Ben Gurion Univ Negev, Dept Informat Syst Engn, IL-84105 Beer Sheva, Israel
[2] Ben Gurion Univ Negev, Telekom Innovat Labs, IL-84105 Beer Sheva, Israel
[3] IBM Res, Haifa, Israel
[4] Ben Gurion Univ Negev, Dept Comp Sci, IL-84105 Beer Sheva, Israel
Keywords
Multi-label classification; Conditional and unconditional label dependence; Generalization bounds; Multi-label evaluation measures; Ensemble learning algorithms; Ensemble models diversity; Empirical experiment; Artificial datasets
DOI
10.1007/s10994-012-5312-9
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Multi-label classification exhibits several challenges not present in the binary case. The labels may be interdependent, so that the presence of a certain label affects the probability of other labels' presence. Thus, exploiting dependencies among the labels could be beneficial for the classifier's predictive performance. Surprisingly, only a few of the existing algorithms address this issue directly by identifying dependent labels explicitly from the dataset. In this paper we propose new approaches for identifying and modeling existing dependencies between labels. One principal contribution of this work is a theoretical confirmation of the reduction in sample complexity that is gained from unconditional dependence. Additionally, we develop methods for identifying conditionally and unconditionally dependent label pairs; clustering them into several mutually exclusive subsets; and finally, performing multi-label classification incorporating the discovered dependencies. We compare these two notions of label dependence (conditional and unconditional) and evaluate the performance of the resulting methods on various benchmark and artificial datasets. We also compare and analyze the labels identified as dependent by each of the methods. Moreover, we define an ensemble framework for the new methods and compare it to existing ensemble methods. An empirical comparison of the new approaches to existing baseline and state-of-the-art methods on 12 benchmark datasets demonstrates that in many cases the proposed single-classifier and ensemble methods outperform many multi-label classification algorithms. Perhaps surprisingly, we discover that the weaker notion of unconditional dependence plays the decisive role.
Pages: 1-42
Page count: 42
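The abstract above describes a pipeline that first detects dependent label pairs and then clusters them into mutually exclusive subsets before classification. As a rough illustration of the unconditional-dependence step only, the following is a minimal sketch that flags label pairs whose co-occurrence counts deviate from independence using a chi-square test on the 2x2 contingency table. The function name, the significance threshold, and the choice of test are assumptions made for illustration, not the paper's exact procedure; the clustering and classification steps are likewise omitted here.

import numpy as np
from itertools import combinations
from scipy.stats import chi2_contingency

def unconditionally_dependent_pairs(Y, alpha=0.01):
    # Y: binary indicator matrix of shape (n_samples, n_labels).
    # Returns label pairs (i, j) whose 2x2 co-occurrence table fails a
    # chi-square independence test at significance level alpha.
    n_labels = Y.shape[1]
    dependent = []
    for i, j in combinations(range(n_labels), 2):
        table = np.zeros((2, 2))
        for a in (0, 1):
            for b in (0, 1):
                table[a, b] = np.sum((Y[:, i] == a) & (Y[:, j] == b))
        if (table.sum(axis=0) == 0).any() or (table.sum(axis=1) == 0).any():
            continue  # one label is constant here; the test is undefined
        _, p_value, _, _ = chi2_contingency(table)
        if p_value < alpha:
            dependent.append((i, j, p_value))
    return dependent

# Toy usage: label 1 is a noisy copy of label 0, label 2 is independent.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y0 = rng.integers(0, 2, size=500)
    y1 = np.where(rng.random(500) < 0.9, y0, 1 - y0)  # strongly tied to y0
    y2 = rng.integers(0, 2, size=500)                  # independent label
    Y = np.column_stack([y0, y1, y2])
    print(unconditionally_dependent_pairs(Y))  # expect the pair (0, 1)

A conditional-dependence test would instead compare labels after conditioning on the input features (for example, by testing the residuals of per-label classifiers), which is a heavier procedure; this sketch covers only the simpler unconditional notion that the abstract reports as decisive.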