Learning Hierarchical Multi-label Classification Trees from Network Data

Authors
Stojanova, Daniela [1]
Ceci, Michelangelo [2]
Malerba, Donato [2]
Dzeroski, Saso [1, 3, 4]
Affiliations
[1] Jozef Stefan Inst, Dept Knowledge Technol, Ljubljana, Slovenia
[2] Univ Bari, Dipartimento Informat, Bari, Italy
[3] Jozef Stefan Int Postgrad Sch, Ljubljana, Slovenia
[4] COE, Integrated Approaches Chem & Biol Proteins, Ljubljana, Slovenia
Source
DISCOVERY SCIENCE | 2013 / Vol. 8140
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an algorithm for hierarchical multi-label classification (HMC) in a network context. It is able to classify instances that may belong to multiple classes at the same time and consider the hierarchical organization of the classes. It assumes that the instances are placed in a network and uses information on the network connections during the learning of the predictive model. Many real-world prediction problems have classes that are organized hierarchically and instances that can have pairwise connections. One example is web document classification, where topics (classes) are typically organized into a hierarchy and documents are connected by hyperlinks. Another example, which is considered in this paper, is gene/protein function prediction, where genes/proteins are connected and form protein-to-protein interaction (PPI) networks. Network datasets are characterized by a form of autocorrelation, where the value of a variable at a given node depends on the values of variables at the nodes it is connected with. Combining the hierarchical multi-label classification task with network prediction is thus not trivial and requires the introduction of the new concept of network autocorrelation for HMC. The proposed algorithm is able to profitably exploit network autocorrelation when learning a tree-based prediction model for HMC. The learned model is in the form of a Predictive Clustering Tree (PCT) and predicts multiple (hierarchically organized) labels at the leaves. Experiments show the effectiveness of the proposed approach for different problems of gene function prediction, considering different PPI networks. The results show that different networks introduce different benefits in different problems of gene function prediction.
Pages: 233-248
Number of pages: 16