Binary relevance efficacy for multilabel classification

Cited by: 185
Authors
Luaces, Oscar [1]
Diez, Jorge [1]
Barranquero, Jose [1]
del Coz, Juan Jose [1]
Bahamonde, Antonio [1]
Affiliations
[1] Univ Oviedo Gijon, Artificial Intelligence Ctr, Campus Viesques, Gijon 33204, Asturias, Spain
Keywords
Multilabel classification; Binary relevance; Synthetic datasets; Label dependency
DOI
10.1007/s13748-012-0030-x
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of multilabel (ML) classification is to induce models able to tag objects with the labels that best describe them. The main baseline for ML classification is binary relevance (BR), which is commonly criticized in the literature because of its label independence assumption. Despite this criticism, this paper discusses some interesting properties of BR, mainly that it produces optimal models for several ML loss functions. Additionally, we present an analytical study of ML benchmark datasets and point out some of their shortcomings. As a result, this paper proposes the use of synthetic datasets to better analyze the behavior of ML methods in domains with different characteristics. To support this claim, we perform experiments on synthetic data that show the competitive performance of BR with respect to a more complex method in difficult problems with many labels, a conclusion that was not stated in previous studies.
Pages: 303-313
Page count: 11