BIAS-CORRECTED QUANTILE REGRESSION FORESTS FOR HIGH-DIMENSIONAL DATA

Cited by: 0
Authors
Nguyen Thanh Tung [1, 4]
Huang, Joshua Zhexue [1, 2]
Thuy Thi Nguyen [3]
Khan, Imran [1]
Affiliations
[1] Chinese Acad Sci, SIAT, Shenzhen Key Lab High Performance Data Min, Shenzhen 518055, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[3] Hanoi Univ Agr Vietnam, Hanoi, Vietnam
[4] Water Resources Univ, Hanoi, Vietnam
Keywords
Bias Correction; Quantile Regression Forests; High-Dimensional Data; Random Forests; Data Mining
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The Quantile Regression Forest (QRF), a nonparametric regression method based on random forests, has been shown to perform well in terms of prediction accuracy, especially for non-Gaussian conditional distributions. However, the method may suffer from two kinds of bias when solving regression problems: bias in the feature selection stage and bias in the regression itself. In this paper, we propose a new bias-correction algorithm based on the QRF. To correct the first kind of bias, we propose a new feature sampling scheme that selects good features for growing trees; the first-level QRF is built with this scheme. To correct the second kind of bias, the residual term of the first-level QRF model is used as the response feature to train a second-level QRF model. The second-level model is then used to compute bias-corrected predictions. In our experiments, the proposed algorithm dramatically reduces prediction errors and outperforms most existing regression random forest models on both synthetic and well-known real-world data sets.
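The two-level residual idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a k-nearest-neighbour conditional quantile estimator stands in for the QRF, the feature is one-dimensional, residuals are computed leave-one-out (the abstract does not say how they are obtained), and the proposed feature sampling scheme is omitted because the abstract gives no details of it. All function names (`knn_quantile`, `level_one_residuals`, `predict_corrected`) are hypothetical.

```python
def knn_quantile(train_x, train_y, x, k=3, q=0.5, skip=None):
    """Empirical q-quantile of the responses of the k training points nearest to x.

    Stand-in for a quantile regression forest (assumption, not the paper's model).
    `skip` excludes one training index, which gives leave-one-out predictions.
    """
    idx = [i for i in range(len(train_x)) if i != skip]
    idx.sort(key=lambda i: abs(train_x[i] - x))   # stable sort: ties keep index order
    ys = sorted(train_y[i] for i in idx[:k])
    pos = min(int(q * len(ys)), len(ys) - 1)
    return ys[pos]


def level_one_residuals(train_x, train_y, k=3):
    """Residuals of the first-level model, each computed without its own point."""
    return [
        y - knn_quantile(train_x, train_y, x, k, skip=i)
        for i, (x, y) in enumerate(zip(train_x, train_y))
    ]


def predict_corrected(train_x, train_y, residuals, x, k=3):
    """First-level prediction plus the second-level prediction of its residual."""
    base = knn_quantile(train_x, train_y, x, k)
    correction = knn_quantile(train_x, residuals, x, k)  # second level: residual as response
    return base + correction


# Toy data: y = x^2, where a local median underestimates near the right boundary.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [0, 1, 4, 9, 16, 25]

residuals = level_one_residuals(train_x, train_y)
base = knn_quantile(train_x, train_y, 5)                        # 16 (true value is 25)
corrected = predict_corrected(train_x, train_y, residuals, 5)   # 23
print(base, corrected)
```

On this toy example the corrected prediction at x = 5 is closer to the true value than the first-level prediction, but the sketch makes no claim that residual correction helps at every point; at x = 0, for instance, it does not.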
Pages: 1 / 6
Page count: 6