Bayesian weighted random forest for classification of high-dimensional genomics data

Cited by: 7
Authors
Olaniran, Oyebayo Ridwan [1]
Abdullah, Mohd Asrul A. [2]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
[2] UTHM, Dept Math & Stat, FAST, Parit Raja, Johor, Malaysia
Keywords
Bayesian; High-dimensional; Genomic data; Classification; Random forest; VARIABLE SELECTION; BREAST-CANCER; GENE; PREDICTION; TUMOR; PATTERNS; LEUKEMIA
DOI
10.1016/j.kjs.2023.06.008
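Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract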
In this paper, a fully Bayesian weighted probabilistic model is developed for random classification trees. The new model, the Bayesian Weighted Random Classification Forest (BWRCF), modifies the existing random classification forest in two ways. First, the tree terminal-node estimation procedure is replaced with a Bayesian estimation approach. Second, a new variable ranking procedure is developed and hybridized with BWRCF to tackle high dimensionality. The performance of the proposed method is assessed on simulated and real-life high-dimensional microarray datasets using holdout accuracy and misclassification error rates. The results show that BWRCF is robust to moderate and large degrees of high dimensionality. In addition, BWRCF shows improved predictive performance and efficiency over selected competing methods.
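As a rough illustration of what Bayesian estimation at a terminal node can look like, the sketch below computes posterior-mean class probabilities for a leaf and averages them across trees. The abstract does not state the prior, the weighting scheme, or the variable ranking procedure, so the symmetric Dirichlet prior, the equal-weight averaging, and the names `leaf_posterior`, `forest_predict`, and `alpha` are all assumptions made for illustration, not the authors' BWRCF.

```python
from collections import Counter


def leaf_posterior(labels, classes, alpha=1.0):
    """Posterior-mean class probabilities for one terminal node.

    Assumes a symmetric Dirichlet(alpha) prior over class proportions;
    this prior is an illustrative choice, not taken from the paper.
    """
    counts = Counter(labels)
    total = len(labels) + alpha * len(classes)
    return {c: (counts.get(c, 0) + alpha) / total for c in classes}


def forest_predict(leaf_label_sets, classes, alpha=1.0):
    """Average the per-tree leaf posteriors (equal weights, an assumption)
    and return the most probable class with the averaged probabilities."""
    n_trees = len(leaf_label_sets)
    avg = {c: 0.0 for c in classes}
    for labels in leaf_label_sets:
        post = leaf_posterior(labels, classes, alpha)
        for c in classes:
            avg[c] += post[c] / n_trees
    best = max(classes, key=lambda c: avg[c])
    return best, avg


# Hypothetical example: the training labels that fell into the leaf
# reached by one test sample in each of two trees.
prediction, probabilities = forest_predict(
    [["tumor", "tumor", "normal"], ["normal"]],
    classes=["tumor", "normal"],
)
```

Compared with raw leaf frequencies, the prior keeps probabilities away from exactly 0 or 1 in leaves that hold only a handful of samples, which is the usual situation when the number of genes far exceeds the number of samples.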
Pages: 477-484
Page count: 8
Related papers
50 in total
  • [41] ForestSubtype: a cancer subtype identifying approach based on high-dimensional genomic data and a parallel random forest
    Junwei Luo
    Yading Feng
    Xuyang Wu
    Ruimin Li
    Jiawei Shi
    Wenjing Chang
    Junfeng Wang
    BMC Bioinformatics, 24
  • [42] Supervised Classification of High-Dimensional Correlated Data: Application to Genomic Data
    Aboubacry Gaye
    Abdou Ka Diongue
    Seydou Nourou Sylla
    Maryam Diarra
    Amadou Diallo
    Cheikh Talla
    Cheikh Loucoubar
    Journal of Classification, 2024, 41 : 158 - 169
  • [43] Continuous Conditional Random Fields in Predicting High-Dimensional Data
    Purbarani, Sumarsih Condroayu
    Sanabila, H. R.
    Wibisono, Ari
    Jatmiko, Wisnu
    2017 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2017, : 427 - 432
  • [44] Supervised Classification of High-Dimensional Correlated Data: Application to Genomic Data
    Gaye, Aboubacry
    Diongue, Abdou Ka
    Sylla, Seydou Nourou
    Diarra, Maryam
    Diallo, Amadou
    Talla, Cheikh
    Loucoubar, Cheikh
    JOURNAL OF CLASSIFICATION, 2024, 41 (01) : 158 - 169
  • [45] Development of biomarker classifiers from high-dimensional data
    Baek, Songjoon
    Tsai, Chen-An
    Chen, James J.
    BRIEFINGS IN BIOINFORMATICS, 2009, 10 (05) : 537 - 546
  • [46] Penalized weighted smoothed quantile regression for high-dimensional longitudinal data
    Song, Yanan
    Han, Haohui
    Fu, Liya
    Wang, Ting
    STATISTICS IN MEDICINE, 2024, 43 (10) : 2007 - 2042
  • [47] Bayesian feature selection for high-dimensional linear regression via the Ising approximation with applications to genomics
    Fisher, Charles K.
    Mehta, Pankaj
    BIOINFORMATICS, 2015, 31 (11) : 1754 - 1761
  • [48] High-dimensional spectral data classification with nonparametric feature screening
    Li, Chuan-Quan
    Xu, Qing-Song
    JOURNAL OF CHEMOMETRICS, 2020, 34 (03)
  • [49] Bayesian Inference on High-Dimensional Multivariate Binary Responses
    Chakraborty, Antik
    Ou, Rihui
    Dunson, David B.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, : 2560 - 2571
  • [50] Hierarchical classification of microorganisms based on high-dimensional phenotypic data
    Tafintseva, Valeria
    Vigneau, Evelyne
    Shapaval, Volha
    Cariou, Veronique
    Qannari, El Mostafa
    Kohler, Achim
    JOURNAL OF BIOPHOTONICS, 2018, 11 (03)