Bayesian weighted random forest for classification of high-dimensional genomics data

Cited by: 7

Authors:

Olaniran, Oyebayo Ridwan [1]

Abdullah, Mohd Asrul A. [2]

Affiliations:

[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria

[2] UTHM, Dept Math & Stat, FAST, Parit Raja, Johor, Malaysia

Source:

KUWAIT JOURNAL OF SCIENCE | 2023 / Vol. 50 / Issue 04

Keywords:

Bayesian; High-dimensional; Genomic data; Classification; Random forest; VARIABLE SELECTION; BREAST-CANCER; GENE; PREDICTION; TUMOR; PATTERNS; LEUKEMIA

DOI:

10.1016/j.kjs.2023.06.008

Chinese Library Classification (CLC) number:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

In this paper, a fully Bayesian weighted probabilistic model is developed for random classification trees. The new model, Bayesian Weighted Random Classification Forest (BWRCF), modifies the existing random classification forest in two ways. First, the tree terminal node estimation procedure is replaced with a Bayesian estimation approach. Second, a new variable ranking procedure is developed and then hybridized with BWRCF to tackle high-dimensionality issues. The performance of the proposed method is analyzed on simulated and real-life high-dimensional microarray datasets using holdout accuracy and misclassification error rates. The results show that the proposed BWRCF is robust, in that it withstands moderate to large high-dimensionality scenarios. In addition, BWRCF shows improved predictive performance and efficiency over selected competing methods.
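The abstract does not state which prior or estimator BWRCF uses at the terminal nodes, so the following is only a minimal sketch of the general idea of Bayesian terminal-node estimation, not the authors' method. It assumes a symmetric Dirichlet prior over class proportions (the function names and the `alpha` parameter are hypothetical) and also shows the misclassification error rate mentioned as an evaluation metric.

```python
from collections import Counter


def leaf_class_posterior(labels, classes, alpha=1.0):
    """Posterior mean class probabilities for one terminal node.

    Assumption: a symmetric Dirichlet(alpha) prior over class proportions.
    This prior is an illustrative choice, not taken from the paper.
    """
    counts = Counter(labels)
    total = len(labels) + alpha * len(classes)
    return {c: (counts.get(c, 0) + alpha) / total for c in classes}


def misclassification_rate(y_true, y_pred):
    """Fraction of holdout samples predicted incorrectly (1 - accuracy)."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


# A leaf holding 3 "tumor" and 1 "normal" training samples.
posterior = leaf_class_posterior(
    ["tumor", "tumor", "tumor", "normal"], ["tumor", "normal"]
)
```

Compared with the raw class frequencies (0.75 and 0.25 here), the posterior mean is pulled toward the uniform prior, which matters most in leaves with very few samples, a common situation when there are far more features than observations.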


Pages: 477 - 484

Number of pages: 8
