An Empirical Comparison Of Feature Selection Methods In Problem Transformation Multi-label Classification

Cited by: 4
Authors
Rodriguez, J. M. [1]
Godoy, D. [2]
Zunino, A. [2]
Affiliations
[1] Univ Nacl Ctr Prov Buenos Aires UNICEN, ISISTAN, CONICET, Tandil, Buenos Aires, Argentina
[2] UNICEN CONICET, ISISTAN, Tandil, Buenos Aires, Argentina
Keywords
Multi-label Classification; Feature Selection; Problem Transformation Classification; Binary Relevance; Pair-Wise; HOMER;
DOI
10.1109/TLA.2016.7786364
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-label classification (MLC) is a supervised learning problem in which an example can be associated with a set of labels rather than a single label, as in traditional classification. Many real-world applications, such as Web page classification or resource tagging on the Social Web, are challenging for existing MLC algorithms because the label space grows exponentially as the instance space increases. Under the problem transformation approach, the most common alternative for MLC, a multi-label problem is transformed into several single-label problems, whose outputs are then aggregated into a prediction for the whole classification problem. Feature selection techniques become crucial in large-scale MLC problems to help reduce dimensionality. However, the impact of feature selection in the multi-label setting has not been studied as extensively as in the single-label case. In this paper, we present an empirical evaluation of feature selection techniques in the context of the three main problem transformation MLC methods: Binary Relevance, Pair-wise and Label power-set. Experiments were performed across a number of benchmark multi-label classification datasets exhibiting varied characteristics, which allows observing the behavior of the techniques and assessing their impact according to multiple metrics.
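As an illustration of the problem transformation idea named in the abstract, the following minimal sketch shows how one multi-label target can be rewritten as single-label problems under Binary Relevance, Pair-wise and Label power-set. It is not taken from the paper: the toy label sets, the function names and the pair-wise convention (keep only examples carrying exactly one label of the pair) are assumptions made for illustration, and no classifier or feature selection step is included.

```python
from itertools import combinations

# Toy multi-label targets (hypothetical data): one label set per example.
Y = [{"sports"}, {"sports", "politics"}, {"tech"}, {"politics", "tech"}]
labels = sorted(set().union(*Y))  # ['politics', 'sports', 'tech']


def binary_relevance(Y, labels):
    """One binary target vector per label: 1 if the example carries it."""
    return {lab: [int(lab in y) for y in Y] for lab in labels}


def pairwise(Y, labels):
    """One binary problem per label pair (a, b).

    Assumed convention: keep only examples that carry exactly one of the
    two labels; the target is 1 when the example carries a, 0 when b.
    Each entry is (example_index, target).
    """
    problems = {}
    for a, b in combinations(labels, 2):
        problems[(a, b)] = [
            (i, int(a in y)) for i, y in enumerate(Y) if (a in y) != (b in y)
        ]
    return problems


def label_powerset(Y):
    """Each distinct label set becomes one class of a multi-class problem."""
    return [tuple(sorted(y)) for y in Y]


br = binary_relevance(Y, labels)
pw = pairwise(Y, labels)
lp = label_powerset(Y)
```

In this sketch, Binary Relevance yields one problem per label, Pair-wise yields one per label pair, and Label power-set yields a single problem with one class per distinct label set, which is where the growth of the label space becomes visible.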
Pages: 3784-3791
Number of pages: 8