BIAS-CORRECTED QUANTILE REGRESSION FORESTS FOR HIGH-DIMENSIONAL DATA

Authors
Nguyen Thanh Tung [1, 4]
Huang, Joshua Zhexue [1, 2]
Thuy Thi Nguyen [3]
Khan, Imran [1]
Affiliations
[1] Chinese Acad Sci, SIAT, Shenzhen Key Lab High Performance Data Min, Shenzhen 518055, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[3] Hanoi Univ Agr Vietnam, Hanoi, Vietnam
[4] Water Resources Univ, Hanoi, Vietnam
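Keywords
Bias Correction; Quantile Regression Forests; High-Dimensional Data; Random Forests; Data Mining
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract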
The Quantile Regression Forest (QRF), a nonparametric regression method based on random forests, has been shown to perform well in terms of prediction accuracy, especially for non-Gaussian conditional distributions. However, the method can suffer from two kinds of bias when solving regression problems: bias in the feature selection stage and bias in the regression stage itself. In this paper, we propose a new bias-correction algorithm based on the QRF. To correct the first kind of bias, we propose a new feature sampling scheme that selects good features for growing trees; the first-level QRF is built using this scheme. To correct the second kind of bias, the residual term of the first-level QRF model is used as the response feature to train a second-level QRF model. The second-level model is then used to compute bias-corrected predictions. In our experiments, the proposed algorithm dramatically reduces prediction errors and outperforms most existing regression random forest models on both synthetic and well-known real-world data sets.
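The two-level residual idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: a k-nearest-neighbour average (the hypothetical helper `knn_mean`) stands in for the QRF, leave-one-out predictions stand in for out-of-bag predictions, and the correction is assumed to be additive (first-level prediction plus predicted residual). The feature sampling scheme for the first kind of bias is not covered, since the abstract gives no details of it.

```python
from statistics import mean


def knn_mean(xs, ys, x, k=3, skip=None):
    """Stand-in learner (NOT a QRF): mean response of the k points nearest to x.

    `skip` leaves one training index out, mimicking an out-of-bag prediction.
    """
    candidates = [i for i in range(len(xs)) if i != skip]
    nearest = sorted(candidates, key=lambda i: abs(xs[i] - x))[:k]
    return mean(ys[i] for i in nearest)


def fit_two_level(xs, ys, k=3):
    """Return (level1, corrected) prediction functions for 1-D inputs."""
    # Level 1: held-out predictions for every training point.
    held_out = [knn_mean(xs, ys, x, k, skip=i) for i, x in enumerate(xs)]
    # The level-1 residuals become the response for the level-2 model.
    residuals = [y - p for y, p in zip(ys, held_out)]

    def level1(x):
        return knn_mean(xs, ys, x, k)

    def corrected(x):
        # Assumption: additive correction with the predicted residual.
        return level1(x) + knn_mean(xs, residuals, x, k)

    return level1, corrected


# Toy data: a linear response, where local averaging is biased at the boundary.
xs = [float(i) for i in range(10)]
ys = [2.0 * x for x in xs]
level1, corrected = fit_two_level(xs, ys)
print(level1(0.0), corrected(0.0))
```

On this toy data the true response at x = 0 is 0. The first-level prediction is pulled upward by its neighbours, and the second-level model, trained on the residuals, moves the prediction back toward the true value. Any regression learner could take the place of `knn_mean`; the paper itself uses QRF models at both levels.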
Pages: 1-6
Page count: 6