Multi-label classification with label clusters

Cited by: 0
Authors
Gatto, Elaine Cecilia [1]
Ferrandin, Mauri [2]
Cerri, Ricardo [3]
Affiliations
[1] Univ Fed Sao Carlos, Dept Comp Sci, BR-13565905 Sao Carlos, SP, Brazil
[2] Univ Fed Santa Catarina, Dept Control Automat & Comp Engn, BR-89036002 Blumenau, SC, Brazil
[3] Univ Sao Paulo, Inst Math & Comp Sci, Ave Trabalhador Sao Carlense,400 Ctr, BR-13566590 Sao Carlos, SP, Brazil
Keywords
Multi-label correlations; Multi-label partitioning; Multi-label clustering; Multi-label classification; Multi-label learning; CLASSIFIERS; DEPENDENCE; ENSEMBLES;
DOI
10.1007/s10115-024-02270-9
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-label classification is the task of simultaneously predicting a set of labels for an instance, with global and local being the two predominant approaches. The global approach trains a single classifier to handle all classes simultaneously, while the local approach breaks down the problem into multiple binary problems. Despite extensive research, effectively capturing label correlations remains a challenge in both methods. In this paper, we introduce an approach that clusters the label space to create hybrid partitions (disjoint correlated label clusters), striking a balance between global and local strategies while leveraging both advantages. Our approach consists of (i) clustering the label space based on correlations, (ii) generating and validating the resulting hybrid partitions, (iii) selecting the best partitions, and (iv) evaluating their performance. We also compare our approach against an oracle, exhaustive search, and random search to assess how closely our hybrid partitions approximate the best possible partitions. The oracle selects the best partition using the test set, while the exhaustive approach relies on validation data. Experiments conducted on multiple multi-label datasets demonstrate that our method, along with random partitions, achieves results that are superior or competitive compared to traditional global and local approaches, as well as the state-of-the-art Ensemble of Classifier Chains. These findings suggest that conventional methods may not fully capture label correlations, and clustering the label space offers a promising solution.
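Step (i) of the abstract, clustering the label space based on correlations, can be illustrated with a small sketch. This is not the authors' implementation: the Jaccard similarity measure, the single-linkage threshold cut, the function names `jaccard` and `cluster_labels`, and the toy label matrix `Y` are all assumptions chosen only to show how disjoint correlated label clusters might be formed.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two binary label columns."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def cluster_labels(Y, threshold=0.5):
    """Partition label indices into disjoint clusters (hypothetical helper).

    Two labels share a cluster when they are linked by a chain of pairs
    whose Jaccard similarity is at least `threshold`, i.e. single-linkage
    clustering cut at that threshold.
    """
    n_labels = len(Y[0])
    columns = [[row[j] for row in Y] for j in range(n_labels)]
    parent = list(range(n_labels))  # union-find forest over label indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n_labels), 2):
        if jaccard(columns[i], columns[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for j in range(n_labels):
        clusters.setdefault(find(j), []).append(j)
    return sorted(clusters.values())


# Toy label matrix (assumed data): rows are instances, columns are labels 0..3.
Y = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]
partition = cluster_labels(Y, threshold=0.5)
print(partition)  # [[0, 1], [2, 3]]
```

In this sketch, a very high threshold leaves every label in its own cluster, which resembles the local approach, while a threshold of 0 merges all labels into one cluster, which resembles the global approach. Intermediate thresholds give partitions between the two extremes; a classifier would then be trained per cluster, a step not shown here.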
Pages: 1741-1785
Page count: 45
Related papers
68 in total
  • [1] Abeyrathna DLBGM, 2018, MULTILABEL CLASSIFIC
  • [2] [Anonymous], 2010, THESIS U WAIKATO
  • [3] Multi-Label learning in the independent label sub-spaces
    Barezi, Elham J.
    Kwok, James T.
    Rabiee, Hamid R.
    [J]. PATTERN RECOGNITION LETTERS, 2017, 97 : 8 - 12
  • [4] Beyond global and local multi-target learning
    Basgalupp, Marcio
    Cerri, Ricardo
    Schietgat, Leander
    Triguero, Isaac
    Vens, Celine
    [J]. INFORMATION SCIENCES, 2021, 579 : 508 - 524
  • [5] Fast Component Density Clustering in Spatial Databases: A Novel Algorithm
    Bataineh, Bilal
    [J]. INFORMATION, 2022, 13 (10)
  • [6] Comprehensive comparative study of multi-label classification methods
    Bogatinovski, Jasmin
    Todorovski, Ljupco
    Dzeroski, Saso
    Kocev, Dragi
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 203
  • [7] Learning multi-label scene classification
    Boutell, MR
    Luo, JB
    Shen, XP
    Brown, CM
    [J]. PATTERN RECOGNITION, 2004, 37 (09) : 1757 - 1771
  • [8] Tips, guidelines and tools for managing multi-label datasets: The mldr.datasets R package and the Cometa data repository
    Charte, Francisco
    Rivera, Antonio J.
    Charte, David
    del Jesus, María J.
    Herrera, Francisco
    [J]. NEUROCOMPUTING, 2018, 289 : 68 - 85
  • [9] Enhancement of DNN-based multilabel classification by grouping labels based on data imbalance and label correlation
    Chen, Ling
    Wang, Yuhong
    Li, Hao
    [J]. PATTERN RECOGNITION, 2022, 132
  • [10] DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method
    Chu, Yanyi
    Shan, Xiaoqi
    Chen, Tianhang
    Jiang, Mingming
    Wang, Yanjing
    Wang, Qiankun
    Salahub, Dennis Russell
    Xiong, Yi
    Wei, Dong-Qing
    [J]. BRIEFINGS IN BIOINFORMATICS, 2021, 22 (03)