Prediction of flood risk levels of urban flooded points though using machine learning with unbalanced data

Cited by: 8
Authors
Wang, Hongfa [1, 2]
Meng, Yu [1, 2]
Xu, Hongshi [1, 2]
Wang, Huiliang [1, 2]
Guan, Xinjian [1, 2]
Liu, Yuan [1, 2]
Liu, Meng [1, 2]
Wu, Zening [1, 2]
Affiliations
[1] Zhengzhou Univ, Sch Water Conservancy & Transportat, Zhengzhou 450001, Henan, Peoples R China
[2] Zhengzhou Univ, Yellow River Lab, Zhengzhou 450001, Henan, Peoples R China
Keywords
Flood risk; Borderline-SMOTE; K-Means; Mahalanobis distance; Genetic Algorithm; Random Forest; DECISION-MAKING; CLASSIFICATION; SMOTE
DOI
10.1016/j.jhydrol.2024.130742
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
With growing emphasis on urban flood prevention and rational urban development, more data on urban flooding are being collected, but the sample sizes of the classes are unbalanced, a phenomenon that is also widespread in other fields. Unbalanced datasets compromise the performance of classification models, so minority-class samples, here floods with higher risk, are often missed or warned of incorrectly. To solve this problem, this research proposes a novel hybrid resampling method that proved effective for balancing data. First, it optimizes the imbalanced dataset with the Borderline-SMOTE algorithm. Next, alternative datasets are synthesized through under-sampling techniques; their quality is evaluated using information entropy, which is calculated with the k-nearest neighbor entropy estimator. The proposed method not only makes full use of the information in the original data but also avoids the under-fitting caused by using under-sampling alone. In a practical application in the central area of Zhengzhou, China, the resampling method was combined with a Random Forest classification model optimized by a Genetic Algorithm. The results are significantly better than those obtained without any treatment: all assessment indicators (Accuracy, Recall, G-mean and F1-score) improved.
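The four assessment indicators named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a two-class simplification (low risk = 0, high risk = 1), whereas the paper predicts several risk levels, and the function name `binary_metrics` and the toy labels are hypothetical. It shows why Accuracy alone is misleading on unbalanced data, which is the motivation the abstract gives for resampling.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, Recall, G-mean and F1-score for a two-class problem.

    Assumption: `positive` marks the minority (high-risk) class.
    G-mean is taken as sqrt(recall * specificity), its usual binary form.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "recall": recall,
            "g_mean": g_mean, "f1": f1}


# Hypothetical toy labels: 8 low-risk points (0) and 2 high-risk points (1).
y_true = [0] * 8 + [1] * 2

# A classifier that never warns still reaches 80% accuracy.
always_low = [0] * 10
# A classifier that catches one high-risk point at the cost of one false alarm.
one_hit = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

print(binary_metrics(y_true, always_low))
print(binary_metrics(y_true, one_hit))
```

Both toy classifiers have the same Accuracy, but only the second has non-zero Recall, G-mean and F1-score, which is why the abstract reports all four indicators rather than Accuracy alone.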
Pages: 13
Related papers
50 in total
  • [31] Development of risk maps for flood, landslide, and soil erosion using machine learning model
    Javidan, Narges
    Kavian, Ataollah
    Conoscenti, Christian
    Jafarian, Zeinab
    Kalehhouei, Mahin
    Javidan, Raana
    NATURAL HAZARDS, 2024, 120 (13) : 11987 - 12010
  • [32] ANALYSIS OF ENVIRONMENTAL FLOOD PREDICTION USING SOPHISTICATED MACHINE LEARNING ALGORITHM WITH IMPLEMENTATION IN SOCIAL MEDIA
    Rajeshkannan, C.
    Kogilavani, S. V.
    JOURNAL OF ENVIRONMENTAL PROTECTION AND ECOLOGY, 2020, 21 (05) : 1837 - 1849
  • [33] Modeling rules of regional flash flood susceptibility prediction using different machine learning models
    Chen, Yuguo
    Zhang, Xinyi
    Yang, Kejun
    Zeng, Shiyi
    Hong, Anyu
    FRONTIERS IN EARTH SCIENCE, 2023, 11
  • [34] Prediction of Personal Cardiovascular Risk using Machine Learning for Smartphone Applications
    Seto, Edmund
    Gravina, Raffaele
    Kim, Jenna
    Lin, Shuhao
    Ferrara, Giannina
    Hua, Jenna
    PROCEEDINGS OF THE 2020 IEEE INTERNATIONAL CONFERENCE ON HUMAN-MACHINE SYSTEMS (ICHMS), 2020 : 405 - 410
  • [35] Using Machine Learning Algorithms for Breast Cancer Risk Prediction and Diagnosis
    Asri, Hiba
    Mousannif, Hajar
    Al Moatassime, Hassan
    Noel, Thomas
    7TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT 2016) / THE 6TH INTERNATIONAL CONFERENCE ON SUSTAINABLE ENERGY INFORMATION TECHNOLOGY (SEIT-2016) / AFFILIATED WORKSHOPS, 2016, 83 : 1064 - 1069
  • [36] Cropland prediction using remote sensing, ancillary data, and machine learning
    Katal, Nitish
    Hooda, Nishtha
    Sharma, Ashish
    Sharma, Bhisham
    JOURNAL OF APPLIED REMOTE SENSING, 2023, 17 (02)
  • [37] Suicidal ideation prediction in twitter data using machine learning techniques
    Kumar, E. Rajesh
    Rao, K. V. S. N. Rama
    Nayak, Soumya Ranjan
    Chandra, Ramesh
    JOURNAL OF INTERDISCIPLINARY MATHEMATICS, 2020, 23 (01) : 117 - 125
  • [38] Investigation of Machine Learning Techniques for Disruption Prediction Using JET Data
    Croonen, Joost
    Amaya, Jorge
    Lapenta, Giovanni
    PLASMA, 2023, 6 (01) : 89 - 102
  • [39] Prediction of Thermogravimetric Data for Asphaltenes Extracted from Deasphalted Oil Using Machine Learning Techniques
    Sivaramakrishnan, Kaushik
    Tannous, Joy H.
    Chandrasekaran, Vignesh
    INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2023, 62 (43) : 17787 - 17804
  • [40] Data processing pipeline for cardiogenic shock prediction using machine learning
    Jajcay, Nikola
    Bezak, Branislav
    Segev, Amitai
    Matetzky, Shlomi
    Jankova, Jana
    Spartalis, Michael
    El Tahlawi, Mohammad
    Guerra, Federico
    Friebel, Julian
    Thevathasan, Tharusan
    Berta, Imrich
    Poelzl, Leo
    Naegele, Felix
    Pogran, Edita
    Cader, F. Aaysha
    Jarakovic, Milana
    Gollmann-Tepekoeylue, Can
    Kollarova, Marta
    Petrikova, Katarina
    Tica, Otilia
    Krychtiuk, Konstantin A.
    Tavazzi, Guido
    Skurk, Carsten
    Huber, Kurt
    Boehm, Allan
    FRONTIERS IN CARDIOVASCULAR MEDICINE, 2023, 10