An extensive experimental comparison of methods for multi-label learning

Cited by: 524
Authors
Madjarov, Gjorgji [1, 2]
Kocev, Dragi [2]
Gjorgjevikj, Dejan [1]
Dzeroski, Saso [2]
Affiliations
[1] Ss Cyril & Methodius Univ, Fac Comp Sci & Engn, Skopje 1000, North Macedonia
[2] Jozef Stefan Inst, Dept Knowledge Technol, Ljubljana 1000, Slovenia
Keywords
Multi-label ranking; Multi-label classification; Comparison of multi-label learning methods; CLASSIFICATION; ALGORITHMS;
DOI
10.1016/j.patcog.2012.03.004
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-label learning has received significant attention in the research community over the past few years: this has resulted in the development of a variety of multi-label learning methods. In this paper, we present an extensive experimental comparison of 12 multi-label learning methods using 16 evaluation measures over 11 benchmark datasets. We selected the competing methods based on their previous usage by the community, the representation of different groups of methods and the variety of basic underlying machine learning methods. Similarly, we selected the evaluation measures to be able to assess the behavior of the methods from a variety of view-points. In order to make conclusions independent from the application domain, we use 11 datasets from different domains. Furthermore, we compare the methods by their efficiency in terms of time needed to learn a classifier and time needed to produce a prediction for an unseen example. We analyze the results from the experiments using Friedman and Nemenyi tests for assessing the statistical significance of differences in performance. The results of the analysis show that for multi-label classification the best performing methods overall are random forests of predictive clustering trees (RF-PCT) and hierarchy of multi-label classifiers (HOMER), followed by binary relevance (BR) and classifier chains (CC). Furthermore, RF-PCT exhibited the best performance according to all measures for multi-label ranking. The recommendation from this study is that when new methods for multi-label learning are proposed, they should be compared to RF-PCT and HOMER using multiple evaluation measures. (C) 2012 Elsevier Ltd. All rights reserved.
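The abstract names Friedman and Nemenyi tests as the basis for the significance analysis. The following is a minimal sketch of how such a rank-based comparison is commonly computed, not the authors' code. The function names, the toy score matrix, and the choice to omit a tie correction in the Friedman statistic are assumptions made for illustration; the critical value `q_alpha` must be looked up in a studentized-range table for the number of methods being compared.

```python
from math import sqrt


def average_ranks(scores):
    """Mean rank of each method over all datasets.

    scores[i][j] is the score of method j on dataset i (higher is better).
    Rank 1 is the best; tied methods share the average of their ranks.
    """
    n_methods = len(scores[0])
    totals = [0.0] * n_methods
    for row in scores:
        order = sorted(range(n_methods), key=lambda j: -row[j])
        pos = 0
        while pos < n_methods:
            end = pos
            while end + 1 < n_methods and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # average of 1-based ranks pos+1..end+1
            for idx in order[pos:end + 1]:
                totals[idx] += shared
            pos = end + 1
    return [t / len(scores) for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic (no tie correction; an assumption of this sketch)."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(q_alpha, n_methods, n_datasets):
    """Two methods differ significantly if their mean ranks differ by more than this."""
    return q_alpha * sqrt(n_methods * (n_methods + 1) / (6 * n_datasets))


# Toy numbers for illustration only; these are NOT results from the paper.
toy_scores = [
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.95, 0.70, 0.75],
    [0.90, 0.85, 0.80],
]
ranks = average_ranks(toy_scores)
chi2 = friedman_statistic(toy_scores)
# 2.343 is the commonly tabulated critical value for 3 methods at alpha = 0.05.
cd = nemenyi_critical_difference(2.343, n_methods=3, n_datasets=4)
print(ranks, chi2, cd)
```

In the study's setting the matrix would have 11 rows (datasets) and 12 columns (methods), with one such matrix per evaluation measure; measures where lower values are better would need the sort order reversed.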
Pages: 3084-3104
Number of pages: 21