Feature Ranking for Hierarchical Multi-Label Classification with Tree Ensemble Methods

Cited by: 7
Authors
Petkovic, Matej [1]
Dzeroski, Saso
Kocev, Dragi
Affiliations
[1] Jozef Stefan Inst, Jamova 39, Ljubljana 1000, Slovenia
Keywords
hierarchical multi-label classification; feature ranking; ensemble methods; Relief; RELIEFF
DOI
10.12700/APH.17.10.2020.10.8
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
In this work, we address the task of feature ranking for hierarchical multi-label classification (HMLC). HMLC concerns problems with multiple binary target variables that are organized into a hierarchy. The goal is to train a model that accurately predicts all of them simultaneously. This task is receiving increasing attention from the research community due to its wide application potential in text document classification and functional genomics. Here, we propose a group of feature ranking methods based on three established ensemble methods of predictive clustering trees: Bagging, Random Forests and Extra Trees. Predictive clustering trees generalize decision trees to the prediction of structured outputs. Furthermore, we propose three scoring functions for calculating the feature importance values: Symbolic, Genie3 and Random Forest. We test the proposed methods on 30 benchmark HMLC datasets and show that the Symbolic and Genie3 scores return relevant rankings, and that all three scores outperform the HMLC-Relief ranking method and are computed in a very time-efficient manner. For each scoring function, we find the most appropriate ensemble method, and we then compare the scores to find the best one.
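
The HMLC target described in the abstract, a set of binary variables organized into a hierarchy, can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy hierarchy, the label names, and the helper functions `with_ancestors` and `encode` are hypothetical, and the sketch assumes a tree-shaped hierarchy in which assigning a label also assigns all of its ancestors.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical toy hierarchy: child label -> parent label (None marks a root).
PARENT: Dict[str, Optional[str]] = {
    "1": None,
    "1.1": "1",
    "1.2": "1",
    "1.1.1": "1.1",
    "2": None,
}

# Fixed ordering of the binary target variables.
ORDER: List[str] = sorted(PARENT)


def with_ancestors(labels: Iterable[str],
                   parent: Dict[str, Optional[str]]) -> Set[str]:
    """Close a label set upward so every label's ancestors are included."""
    closed: Set[str] = set()
    for label in labels:
        node: Optional[str] = label
        # Stop early once a node is already present: its ancestors are too.
        while node is not None and node not in closed:
            closed.add(node)
            node = parent[node]
    return closed


def encode(labels: Iterable[str],
           parent: Dict[str, Optional[str]],
           order: List[str]) -> List[int]:
    """Encode a label set as a 0/1 vector, one entry per hierarchy node."""
    closed = with_ancestors(labels, parent)
    return [1 if name in closed else 0 for name in order]


if __name__ == "__main__":
    print(ORDER)
    print(encode(["1.1.1"], PARENT, ORDER))
    print(encode(["1.2", "2"], PARENT, ORDER))
```

Each example's target then becomes one binary vector over all hierarchy nodes, which is the kind of structured output a single model is asked to predict simultaneously.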
Pages: 129-148
Page count: 20