Comparison of data-driven methods for linking extreme precipitation events to local and large-scale meteorological variables

Authors
Nafsika Antoniadou
Hjalte Jomo Danielsen Sørup
Jonas Wied Pedersen
Ida Bülow Gregersen
Torben Schmith
Karsten Arnbjerg-Nielsen
Affiliations
[1] Technical University of Denmark, Department of Environmental and Resource Engineering, Climate and Monitoring
[2] Danish Meteorological Institute, National Centre for Climate Research
[3] Rambøll Denmark A/S, Department of Climate Adaptation and Green Infrastructure
Keywords
Extreme precipitation; Meteorological drivers; Machine learning; Logistic regression; ROC curve
Abstract
Extreme precipitation events can lead to severe negative consequences for society, the economy, and the environment. It is therefore crucial to understand when such events occur. In the literature, there are a vast number of methods for analyzing their connection to meteorological drivers. However, there has been recent interest in using machine learning methods instead of classic statistical models. While a few studies in climate research have compared the performance of these two approaches, their conclusions are inconsistent. To determine whether an extreme event occurred locally, we trained models using logistic regression and three commonly used supervised machine learning algorithms tailored for discrete outcomes: random forests, neural networks, and support vector machines. We used five explanatory variables (geopotential height at 500 hPa, convective available potential energy, total column water, sea surface temperature, and air surface temperature) from ERA5, and local data from the Danish Meteorological Institute. During the variable selection process, we found that convective available potential energy has the strongest relationship with extreme events. Our results showed that logistic regression performs similarly to more complex machine learning algorithms regarding discrimination as measured by the area under the receiver operating characteristic curve (ROC AUC) and other performance metrics specialized for unbalanced datasets. Specifically, the ROC AUC for logistic regression was 0.86, while the best-performing machine learning algorithm achieved a ROC AUC of 0.87. This study emphasizes the value of comparing machine learning and classical regression modeling, especially when employing a limited set of well-established explanatory variables.
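The discrimination metric named in the abstract, ROC AUC, can be read as the probability that a randomly chosen extreme day receives a higher predicted probability than a randomly chosen non-extreme day. The following is a minimal sketch of that calculation using the rank-sum (Mann-Whitney U) identity. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not the authors' code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the rank-sum (Mann-Whitney U) identity.

    Equals the probability that a randomly chosen positive (extreme) case
    scores higher than a randomly chosen negative case, with ties
    counted as one half.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")

    # Sort indices by score, then give tied scores their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic of the positive class, normalized by the number of pairs.
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Hypothetical toy data: 2 extreme days among 10 (unbalanced classes).
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
p_model = [0.05, 0.10, 0.20, 0.15, 0.80, 0.30, 0.60, 0.02, 0.55, 0.25]
auc = roc_auc(y_true, p_model)
print(f"ROC AUC = {auc:.4f}")  # prints "ROC AUC = 0.9375"
```

Because the metric depends only on how the scores rank the two classes, it is unaffected by class imbalance in the way plain accuracy is, which is one reason it suits rare-event problems such as this one. In the toy data, 15 of the 16 positive-negative pairs are ordered correctly, giving 0.9375.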
Pages: 4337-4357
Number of pages: 20