Prediction of flood risk levels of urban flooded points though using machine learning with unbalanced data

Cited by: 8
Authors
Wang, Hongfa [1, 2]
Meng, Yu [1, 2]
Xu, Hongshi [1, 2]
Wang, Huiliang [1, 2]
Guan, Xinjian [1, 2]
Liu, Yuan [1, 2]
Liu, Meng [1, 2]
Wu, Zening [1, 2]
Affiliations
[1] Zhengzhou Univ, Sch Water Conservancy & Transportat, Zhengzhou 450001, Henan, Peoples R China
[2] Zhengzhou Univ, Yellow River Lab, Zhengzhou 450001, Henan, Peoples R China
Keywords
Flood risk; Borderline-SMOTE; K-Means; Mahalanobis distance; Genetic Algorithm; Random Forest; DECISION-MAKING; CLASSIFICATION; SMOTE
DOI
10.1016/j.jhydrol.2024.130742
Chinese Library Classification
TU [Building Science]
Discipline classification code
0813
Abstract
With growing emphasis on urban flood prevention and rational urban development, more data on urban flooding are being collected, but the sample sizes across classes are unbalanced, a phenomenon that is widespread in many other fields. Unbalanced datasets compromise the performance of classification models, so minority-class samples, here the floods with higher risk, often go unalerted or are warned about incorrectly. To solve this problem, this research proposes a novel hybrid resampling method and shows it to be effective for balancing data. First, the imbalanced dataset is optimized with the Borderline-SMOTE algorithm. Next, alternative datasets are synthesized through under-sampling techniques, and their quality is evaluated using information entropy, calculated with the k-nearest neighbor entropy estimator. The proposed method not only makes full use of the information in the original data but also avoids the under-fitting caused by using under-sampling alone. In a practical application in the central area of Zhengzhou, China, the resampling method was combined with a Random Forest classification model optimized by a Genetic Algorithm; compared with the untreated data, all assessment indicators (Accuracy, Recall, G-mean and F1-score) improved significantly.
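The abstract names four assessment indicators. As a minimal sketch of why Recall, G-mean and F1-score are reported alongside Accuracy on unbalanced data, the example below computes all four for a binary simplification (high-risk versus low-risk). The function name `imbalance_metrics`, the two toy classifiers, and the 95/5 class split are illustrative assumptions, not taken from the paper, which may use multi-class definitions of these indicators.

```python
import math

def imbalance_metrics(y_true, y_pred, positive=1):
    """Accuracy, Recall, G-mean and F1 for a binary problem (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def safe_div(a, b):
        # Return 0.0 instead of raising when a denominator is zero.
        return a / b if b else 0.0

    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    recall = safe_div(tp, tp + fn)        # sensitivity on the minority class
    specificity = safe_div(tn, tn + fp)   # recall on the majority class
    precision = safe_div(tp, tp + fp)
    g_mean = math.sqrt(recall * specificity)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {"accuracy": accuracy, "recall": recall, "g_mean": g_mean, "f1": f1}

# Assumed toy data: 95 low-risk points (0) followed by 5 high-risk points (1).
y_true = [0] * 95 + [1] * 5

# Classifier A always predicts "low risk".
always_low = [0] * 100

# Classifier B raises 10 false alarms but catches 4 of the 5 high-risk points.
balanced = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

baseline = imbalance_metrics(y_true, always_low)
improved = imbalance_metrics(y_true, balanced)

print("always low:", baseline)
print("balanced:  ", improved)
```

In this toy case the always-low classifier has the higher Accuracy yet a Recall and G-mean of zero, which is the failure mode on minority-class samples that the abstract describes.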
Pages: 13