Bayesian weighted random forest for classification of high-dimensional genomics data

Cited by: 7
Authors
Olaniran, Oyebayo Ridwan [1]
Abdullah, Mohd Asrul A. [2]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
[2] UTHM, Dept Math & Stat, FAST, Parit Raja, Johor, Malaysia
Keywords
Bayesian; High-dimensional; Genomic data; Classification; Random forest; VARIABLE SELECTION; BREAST-CANCER; GENE; PREDICTION; TUMOR; PATTERNS; LEUKEMIA
DOI
10.1016/j.kjs.2023.06.008
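Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09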
Abstract
In this paper, a fully Bayesian weighted probabilistic model is developed for random classification trees. The new model, Bayesian Weighted Random Classification Forest (BWRCF), modifies the existing random classification forest in two ways. First, the estimation procedure at the tree terminal nodes is replaced with a Bayesian estimation approach. Second, a new variable ranking procedure is developed and hybridized with BWRCF to address high dimensionality. The performance of the proposed method is analyzed on simulated and real-life high-dimensional microarray datasets using holdout accuracy and misclassification error rates. The results show that the proposed BWRCF is robust in moderate to large high-dimensionality scenarios. In addition, BWRCF shows improved predictive performance and efficiency over selected competing methods.
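As a rough illustration of the first modification described in the abstract (Bayesian estimation at terminal nodes), the following Python sketch contrasts the usual relative-frequency estimate of class probabilities in a leaf with a posterior mean under a symmetric Dirichlet prior. The prior, the function names, and the toy labels are assumptions made for illustration only; the abstract does not specify the estimator the authors use, and the weighting and variable ranking steps are not shown.

```python
from collections import Counter


def frequency_estimate(labels, classes):
    """Plain relative-frequency class probabilities for one terminal node."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in classes}


def dirichlet_posterior_mean(labels, classes, alpha=1.0):
    """Posterior mean class probabilities under an assumed symmetric
    Dirichlet(alpha) prior (an illustrative choice, not the paper's)."""
    counts = Counter(labels)
    total = len(labels) + alpha * len(classes)
    return {c: (counts[c] + alpha) / total for c in classes}


# Toy leaf holding three samples of a single class (hypothetical labels).
leaf = ["tumor", "tumor", "tumor"]
classes = ["tumor", "normal"]

print(frequency_estimate(leaf, classes))        # {'tumor': 1.0, 'normal': 0.0}
print(dirichlet_posterior_mean(leaf, classes))  # {'tumor': 0.8, 'normal': 0.2}
```

When a leaf holds only a few samples, which is typical when features far outnumber observations, the prior keeps the estimated probabilities away from exactly 0 and 1.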
Pages: 477-484
Page count: 8
Related papers
50 in total
  • [11] Research of Medical High-dimensional Imbalanced Data Classification-Ensemble Feature Selection Algorithm with Random Forest
    Zhu, Min
    Su, Bo
    Ning, Gangmin
    2017 INTERNATIONAL CONFERENCE ON SMART GRID AND ELECTRICAL AUTOMATION (ICSGEA), 2017, : 273 - 277
  • [12] Random forest Granger causality for detection of effective brain connectivity using high-dimensional data
    Furqan, Mohammad Shaheryar
    Siyal, Mohammad Yakoob
    JOURNAL OF INTEGRATIVE NEUROSCIENCE, 2016, 15 (01) : 55 - 66
  • [13] Random Forest for Gene Selection and Microarray Data Classification
    Moorthy, Kohbalan
    Mohamad, Mohd Saberi
    KNOWLEDGE TECHNOLOGY, 2012, 295 : 174 - 183
  • [14] The Visualization of E-commerce High-dimensional Data Based on Random Forest
    Zhu Xianwen
    Yin Hongtan
    AGRO FOOD INDUSTRY HI-TECH, 2017, 28 (01) : 987 - 991
  • [15] Stability of feature selection in classification issues for high-dimensional correlated data
    Perthame, Emeline
    Friguet, Chloe
    Causeur, David
    STATISTICS AND COMPUTING, 2016, 26 (04) : 783 - 796
  • [16] Sparse Bayesian variable selection for classifying high-dimensional data
    Yang, Aijun
    Lian, Heng
    Jiang, Xuejun
    Liu, Pengfei
    STATISTICS AND ITS INTERFACE, 2018, 11 (02) : 385 - 395
  • [17] MixDir: Scalable Bayesian Clustering for High-Dimensional Categorical Data
    Ahlmann-Eltze, Constantin
    Yau, Christopher
    2018 IEEE 5TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2018, : 526 - 539
  • [18] Bayesian Conditional Tensor Factorizations for High-Dimensional Classification
    Yang, Yun
    Dunson, David B.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2016, 111 (514) : 656 - 669
  • [19] Knowledge-Guided Bayesian Support Vector Machine for High-Dimensional Data with Application to Analysis of Genomics Data
    Sun, Wenli
    Chang, Changgee
    Zhao, Yize
    Long, Qi
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 1484 - 1493
  • [20] ForestSubtype: a cancer subtype identifying approach based on high-dimensional genomic data and a parallel random forest
    Luo, Junwei
    Feng, Yading
    Wu, Xuyang
    Li, Ruimin
    Shi, Jiawei
    Chang, Wenjing
    Wang, Junfeng
    BMC BIOINFORMATICS, 2023, 24 (01)