Prediction of flood risk levels of urban flooded points though using machine learning with unbalanced data

Cited by: 8
Authors
Wang, Hongfa [1, 2]
Meng, Yu [1, 2]
Xu, Hongshi [1, 2]
Wang, Huiliang [1, 2]
Guan, Xinjian [1, 2]
Liu, Yuan [1, 2]
Liu, Meng [1, 2]
Wu, Zening [1, 2]
Affiliations
[1] Zhengzhou Univ, Sch Water Conservancy & Transportat, Zhengzhou 450001, Henan, Peoples R China
[2] Zhengzhou Univ, Yellow River Lab, Zhengzhou 450001, Henan, Peoples R China
Keywords
Flood risk; Borderline-SMOTE; K-Means; Mahalanobis distance; Genetic Algorithm; Random Forest; DECISION-MAKING; CLASSIFICATION; SMOTE
DOI
10.1016/j.jhydrol.2024.130742
Chinese Library Classification
TU [Building Science]
Discipline classification code
0813
Abstract
As urban flood prevention and rational urban development receive growing emphasis, more data on urban flooding are being collected, but the sample sizes across classes are unbalanced, a phenomenon that is also widespread in other real-world fields. Unbalanced datasets compromise the performance of classification models, so minority-class samples, here the floods with higher risk, are often missed or incorrectly warned. To solve this problem, this research proposes a novel hybrid resampling method that proved effective for balancing data. First, it optimizes the imbalanced dataset with the Borderline-SMOTE algorithm. Next, alternative datasets are synthesized through under-sampling techniques, and their quality is evaluated using information entropy, calculated with the k-nearest neighbor entropy estimator. The suggested method not only makes full use of the information in the original data but also avoids the under-fitting caused by using under-sampling alone. In a practical application in the central area of Zhengzhou, China, the resampling method was combined with a Random Forest classification model optimized by a Genetic Algorithm. The results are significantly better than those obtained without any treatment: all assessment indicators (Accuracy, Recall, G-mean and F1-score) improved.
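The four assessment indicators named in the abstract can be illustrated with a minimal sketch. It assumes a binary simplification (high-risk versus other flooded points), whereas the paper predicts several risk levels; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
import math

def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, recall, G-mean and F1 for a binary problem.

    `positive` marks the minority class (assumed here to be the
    high-risk flooded points; this labelling is an illustration only).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # G-mean: geometric mean of sensitivity and specificity
    g_mean = math.sqrt(recall * specificity)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "recall": recall,
            "g_mean": g_mean, "f1": f1}

# Toy unbalanced labels: 2 high-risk points out of 10 (made-up data).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that never predicts the minority class.
baseline = binary_metrics(y_true, [0] * 10)

# A classifier that finds one high-risk point at the cost of one false alarm.
improved = binary_metrics(y_true, [1, 0, 0, 0, 0, 0, 0, 0, 0, 1])

print(baseline)
print(improved)
```

Both toy classifiers reach the same Accuracy, but only the second has non-zero Recall, G-mean and F1-score, which suggests why indicators beyond Accuracy matter on unbalanced data.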
Pages: 13