Targeting predictors in random forest regression

Cited by: 55
Authors
Borup, Daniel [1, 2, 3]
Christensen, Bent Jesper [1, 3, 4]
Muhlbach, Nicolaj Sondergaard [1, 5]
Nielsen, Mikkel Slot [1, 6]
Affiliations
[1] CREATES, Aarhus, Denmark
[2] Aarhus Univ, Dept Econ & Business Econ, Fuglesangs Alle 4, DK-8210 Aarhus V, Denmark
[3] Danish Finance Inst DFI, Aarhus, Denmark
[4] Aarhus Univ, Dale T Mortensen Ctr, Dept Econ & Business Econ, Aarhus, Denmark
[5] MIT, Dept Econ, Cambridge, MA 02139 USA
[6] Columbia Univ, Dept Stat, New York, NY 10027 USA
Keywords
Random forests; Targeted predictors; High-dimensional forecasting; Weak predictors; Variable selection; Content horizons; Large number; Shrinkage
DOI
10.1016/j.ijforecast.2022.02.010
Chinese Library Classification (CLC) number
F [Economics]
Discipline classification code
02
Abstract
Random forest (RF) regression is an extremely popular tool for analyzing high-dimensional data. Nonetheless, its benefits may be lessened in sparse settings due to weak predictors, and a pre-estimation dimension reduction (targeting) step is required. We show that proper targeting controls the probability of placing splits along strong predictors, thus providing an important complement to RF's feature sampling. This is supported by simulations using finite representative samples. Moreover, we quantify the immediate gain from targeting in terms of the increased strength of individual trees. Macroeconomic and financial applications show that the bias-variance trade-off implied by targeting, due to increased correlation among trees in the forest, is balanced at a medium degree of targeting, selecting the best 5%-30% of commonly applied predictors. Improvements in the predictive accuracy of targeted RF relative to ordinary RF are considerable, up to 21%, occurring both in recessions and expansions, particularly at long horizons. © 2022 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
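The abstract's point about split placement can be illustrated with a simple counting argument. The sketch below is not the paper's result; it only computes the probability that a random candidate set of `mtry` predictors, drawn without replacement from `p` predictors, contains at least one of `s` strong predictors. The function name and the values of `p`, `s`, and `mtry` are hypothetical, and holding `mtry` fixed while targeting shrinks `p` is an assumption made purely for illustration.

```python
from math import comb


def prob_strong_candidate(p: int, s: int, mtry: int) -> float:
    """Probability that a draw of `mtry` out of `p` predictors (without
    replacement) contains at least one of the `s` strong predictors."""
    if not (0 <= s <= p and 1 <= mtry <= p):
        raise ValueError("require 0 <= s <= p and 1 <= mtry <= p")
    # Complement of the event "all mtry candidates are weak".
    return 1.0 - comb(p - s, mtry) / comb(p, mtry)


# Hypothetical setting: 100 predictors, 5 of them strong, 5 candidates per split.
before = prob_strong_candidate(p=100, s=5, mtry=5)

# Assumed effect of targeting: keep 20 predictors, all 5 strong ones retained.
after = prob_strong_candidate(p=20, s=5, mtry=5)

print(f"before targeting: {before:.3f}")
print(f"after targeting:  {after:.3f}")
```

Under these assumed numbers, the chance that a split can even consider a strong predictor rises substantially once the weak predictors are screened out. If `mtry` were instead scaled in proportion to `p`, this particular probability would change far less, so the illustration depends on the fixed-`mtry` assumption.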
Pages: 841-868
Number of pages: 28