Extreme flash flood susceptibility mapping using a novel PCA-based model stacking approach

Cited by: 9
Authors
Shojaeian, Amirreza [1]
Shafizadeh-Moghadam, Hossein [2]
Sharafati, Ahmad [1]
Shahabi, Himan [3, 4]
Affiliations
[1] Islamic Azad Univ, Dept Civil Engn, Sci & Res Branch, Tehran, Iran
[2] Tarbiat Modares Univ, Dept Water Engn & Management, Tehran, Iran
[3] Univ Kurdistan, Fac Nat Resources, Dept Geomorphol, Sanandaj 6617715175, Iran
[4] Silesian Tech Univ, Inst Phys, Div Geochronol & Environm Isotopes, PL-44100 Gliwice, Poland
Keywords
Machine learning; Meta-model; Model integration; PCA; Karkheh Basin; REGRESSION; CLASSIFICATION; PREDICTION; NETWORKS; AREAS;
DOI
10.1016/j.asr.2024.08.004
Chinese Library Classification (CLC) number
V [Aeronautics, Astronautics]
Discipline classification code
08; 0825
Abstract
This study introduces an efficient methodology for model stacking, incorporating six diverse machine learning and statistical models alongside principal component analysis (PCA). The approach is applied to flash flood susceptibility mapping within the Karkheh Basin in Iran. The selected models include random forest (RF), boosted regression trees (BRT), support vector machine (SVM), artificial neural networks (ANN), generalized additive model (GAM), and the least absolute shrinkage and selection operator (Lasso), with RF also serving as the meta-model for the stacking. The results revealed significant correlations among the predictions of the individual models, which could potentially impair the meta-model's efficacy. To address this, PCA was applied to the model predictions to generate decorrelated components as inputs for the meta-model, thereby enhancing prediction accuracy and robustness. Evaluation based on the area under the receiver operating characteristic curve (AUROC) demonstrated that the GAM outperformed all other individual models, with the highest score of 0.924. In contrast, the RF and ANN models had the lowest scores, both 0.872. However, the performance disparity across models was small. Notably, the PCA-based stacking approach (0.936) surpassed both traditional model stacking (0.912) and all individual models. These findings favor the PCA-stacking method over conventional stacking techniques. Nonetheless, further research across varied applications is warranted to establish how well it generalizes. (c) 2024 COSPAR. Published by Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
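The stacking pipeline described in the abstract can be sketched in Python with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (`make_classification`), four scikit-learn classifiers stand in for the six base models, and all hyperparameters, split sizes, and variable names are hypothetical. It shows out-of-fold base-model predictions being passed through PCA before a random forest meta-model is fitted on the resulting components.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: synthetic data stands in for flood / non-flood sample points.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Assumption: illustrative base learners, not the six models of the study.
base_models = [
    RandomForestClassifier(n_estimators=100, random_state=0),
    GradientBoostingClassifier(random_state=0),
    LogisticRegression(max_iter=1000),
    GaussianNB(),
]

# Out-of-fold probabilities on the training set, so the meta-model is not
# trained on predictions the base models made for their own training rows.
train_preds = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# Refit each base model on the full training set and predict the test set.
test_preds = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    for m in base_models
])

# PCA rotates the correlated prediction columns into uncorrelated components.
pca = PCA().fit(train_preds)
train_pcs = pca.transform(train_preds)
test_pcs = pca.transform(test_preds)

# Random forest meta-model trained on the principal components.
meta = RandomForestClassifier(n_estimators=200, random_state=0)
meta.fit(train_pcs, y_train)
auc_pca_stack = roc_auc_score(y_test, meta.predict_proba(test_pcs)[:, 1])

# Correlation matrices before and after the PCA step, for comparison.
corr_raw = np.corrcoef(train_preds, rowvar=False)
corr_pcs = np.corrcoef(train_pcs, rowvar=False)
print("AUROC of PCA-based stack:", round(auc_pca_stack, 3))
```

Because the principal components are uncorrelated on the training predictions by construction, `corr_pcs` is close to the identity matrix, whereas `corr_raw` will typically show sizeable off-diagonal correlations between base models. Whether this improves the meta-model on a given dataset is an empirical question.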
Pages: 5371-5382
Page count: 12