Bayesian weighted random forest for classification of high-dimensional genomics data

Cited by: 7
Authors
Olaniran, Oyebayo Ridwan [1]
Abdullah, Mohd Asrul A. [2]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
[2] UTHM, Dept Math & Stat, FAST, Parit Raja, Johor, Malaysia
Keywords
Bayesian; High-dimensional; Genomic data; Classification; Random forest; VARIABLE SELECTION; BREAST-CANCER; GENE; PREDICTION; TUMOR; PATTERNS; LEUKEMIA
DOI
10.1016/j.kjs.2023.06.008
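Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract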
This paper develops a fully Bayesian weighted probabilistic model for random classification trees. The new model, the Bayesian Weighted Random Classification Forest (BWRCF), modifies the existing random classification forest in two ways. First, the tree terminal-node estimation procedure is replaced with a Bayesian estimation approach. Second, a new variable-ranking procedure is developed and hybridized with BWRCF to address high dimensionality. The performance of the proposed method is evaluated on simulated and real-life high-dimensional microarray datasets using holdout accuracy and misclassification error rates. The results show that BWRCF is robust, withstanding moderate to large degrees of high dimensionality. In addition, BWRCF shows improved predictive performance and efficiency over selected competing methods.
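The abstract does not state which prior or likelihood the Bayesian terminal-node estimator uses. As a rough illustration of the general idea only, the sketch below assumes a symmetric Dirichlet prior over the class proportions in a single terminal node and compares its posterior mean with the relative-frequency estimate that a conventional classification tree would report. The function names, the `alpha` parameter, and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def relative_frequency_probs(labels, classes):
    """Conventional terminal-node estimate: observed class proportions."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in classes}


def posterior_mean_probs(labels, classes, alpha=1.0):
    """Posterior mean of class probabilities in one terminal node.

    Assumes a symmetric Dirichlet(alpha) prior with a multinomial
    likelihood. This is an illustrative choice, not necessarily the
    estimator used in BWRCF.
    """
    counts = Counter(labels)
    total = len(labels) + alpha * len(classes)
    return {c: (counts[c] + alpha) / total for c in classes}


# Toy terminal node holding three samples of a single class (hypothetical data).
classes = ["tumor", "normal"]
node_labels = ["tumor", "tumor", "tumor"]

freq = relative_frequency_probs(node_labels, classes)
post = posterior_mean_probs(node_labels, classes)
print(freq)
print(post)
```

Under this assumption, a small pure node no longer assigns zero probability to the unseen class: the prior pulls the estimate toward uniform, and its influence shrinks as the node accumulates samples. This kind of shrinkage is one reason Bayesian node estimates are attractive when there are few samples relative to the number of variables, as in microarray data.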
Pages: 477-484
Number of pages: 8