A top-down supervised learning approach to hierarchical multi-label classification in networks

Cited by: 0
Authors
Miguel Romero
Jorge Finke
Camilo Rocha
Affiliation
[1] Pontificia Universidad Javeriana, Department of Electronics and Computer Science
Source
Applied Network Science, Volume 7
Keywords
Hierarchical classification; Supervised learning; XGBoost; Top-down approach; Gene function prediction;
DOI
Not available
Abstract
Node classification is the task of inferring or predicting missing node attributes from information available for other nodes in a network. This paper presents a general prediction model for hierarchical multi-label classification, where the attributes to be inferred can be specified as a strict poset. It is based on a top-down classification approach that addresses hierarchical multi-label classification with supervised learning by building a local classifier per class. The proposed model is showcased with a case study on the prediction of gene functions for Oryza sativa Japonica, a variety of rice. It is compared to the Hierarchical Binomial-Neighborhood, a probabilistic model, by evaluating both approaches in terms of prediction performance and computational cost. The results in this work support the working hypothesis that the proposed model can achieve good levels of prediction efficiency, while scaling up in relation to the state of the art.
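The top-down, local-classifier-per-class idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy `CentroidClassifier` (a stand-in for a real learner such as XGBoost), the training policy (each class is trained only on examples that carry all of its parent classes), and the data are assumptions made for illustration.

```python
class CentroidClassifier:
    """Toy binary classifier (hypothetical stand-in for a learner such as XGBoost)."""

    def fit(self, X, y):
        # Store the mean feature vector of the positive and negative examples.
        self.pos = self._mean([x for x, t in zip(X, y) if t])
        self.neg = self._mean([x for x, t in zip(X, y) if not t])
        return self

    @staticmethod
    def _mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)] if rows else None

    @staticmethod
    def _dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    def predict(self, x):
        # Positive when x is closer to the positive centroid.
        if self.pos is None:
            return False
        if self.neg is None:
            return True
        return self._dist(x, self.pos) < self._dist(x, self.neg)


def topological_order(parents):
    """Order classes so that every class appears after all of its parents."""
    order, seen = [], set()

    def visit(c):
        if c in seen:
            return
        seen.add(c)
        for p in parents[c]:
            visit(p)
        order.append(c)

    for c in parents:
        visit(c)
    return order


def fit_top_down(X, Y, parents):
    """Train one local binary classifier per class (assumed training policy)."""
    models = {}
    for c in topological_order(parents):
        # Keep only the examples annotated with every parent of c.
        idx = [i for i, labels in enumerate(Y)
               if all(p in labels for p in parents[c])]
        models[c] = CentroidClassifier().fit(
            [X[i] for i in idx], [c in Y[i] for i in idx])
    return models


def predict_top_down(x, models, parents):
    """Assign a class only if all of its parents were already assigned."""
    predicted = set()
    for c in topological_order(parents):
        if all(p in predicted for p in parents[c]) and models[c].predict(x):
            predicted.add(c)
    return predicted


# Hypothetical hierarchy: "b" and "c" are both children of "a".
parents = {"a": [], "b": ["a"], "c": ["a"]}
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 0.0], [1.1, 0.1], [1.0, 1.0], [0.9, 1.1]]
Y = [set(), set(), {"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "c"}]

models = fit_top_down(X, Y, parents)
print(predict_top_down([1.0, 0.1], models, parents))
print(predict_top_down([0.0, 0.9], models, parents))
```

In this sketch the second query is rejected at the root class "a", so "c" is never assigned even though its local classifier alone would accept the point. This is how a top-down pass keeps predictions consistent with the hierarchy.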