Comparison of data-driven methods for linking extreme precipitation events to local and large-scale meteorological variables

Cited by: 0
Authors
Nafsika Antoniadou
Hjalte Jomo Danielsen Sørup
Jonas Wied Pedersen
Ida Bülow Gregersen
Torben Schmith
Karsten Arnbjerg-Nielsen
Affiliations
[1] Technical University of Denmark, Department of Environmental and Resource Engineering, Climate and Monitoring
[2] Danish Meteorological Institute, National Centre for Climate Research
[3] Rambøll Denmark A/S, Department of Climate Adaptation and Green Infrastructure
Keywords
Extreme precipitation; Meteorological drivers; Machine learning; Logistic regression; ROC curve
DOI
Not available
Abstract
Extreme precipitation events can have severe consequences for society, the economy, and the environment, so it is crucial to understand when such events occur. The literature offers many methods for analyzing their connection to meteorological drivers, and recent interest has shifted toward machine learning methods in place of classic statistical models. While a few studies in climate research have compared the performance of these two approaches, their conclusions are inconsistent. To determine whether an extreme event occurred locally, we trained models using logistic regression and three commonly used supervised machine learning algorithms tailored for discrete outcomes: random forests, neural networks, and support vector machines. We used five explanatory variables (geopotential height at 500 hPa, convective available potential energy, total column water, sea surface temperature, and air surface temperature) from ERA5, and local data from the Danish Meteorological Institute. During the variable selection process, we found that convective available potential energy has the strongest relationship with extreme events. Our results showed that logistic regression performs similarly to more complex machine learning algorithms in terms of discrimination, as measured by the area under the receiver operating characteristic curve (ROC AUC) and other performance metrics specialized for unbalanced datasets. Specifically, the ROC AUC for logistic regression was 0.86, while the best-performing machine learning algorithm achieved an ROC AUC of 0.87. This study emphasizes the value of comparing machine learning and classical regression modeling, especially when employing a limited set of well-established explanatory variables.
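The evaluation idea in the abstract, scoring a logistic regression by ROC AUC on a dataset where events are rare, can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the two predictors and their coefficients are hypothetical stand-ins for standardized meteorological variables, and the helper names (`simulate`, `fit_logistic`, `roc_auc`) are invented for illustration. Only the Python standard library is used.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, rng):
    """Synthetic data: two standardized predictors and a rare binary event."""
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        # Assumed coefficients (not from the study); the negative
        # intercept makes events rare, i.e. the classes are unbalanced.
        p = sigmoid(-3.5 + 1.5 * x[0] + 0.5 * x[1])
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def predict(w, x):
    """Event probability for one sample; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, steps=300):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, yi in zip(X, y):
            err = predict(w, x) - yi
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) identity.

    Tied scores receive their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


rng = random.Random(0)
X_train, y_train = simulate(2000, rng)
X_test, y_test = simulate(2000, rng)

weights = fit_logistic(X_train, y_train)
test_scores = [predict(weights, x) for x in X_test]
auc = roc_auc(y_test, test_scores)

print(f"event rate in test set: {sum(y_test) / len(y_test):.3f}")
print(f"ROC AUC on held-out data: {auc:.3f}")
```

ROC AUC depends only on how the model ranks events relative to non-events, so it is not inflated by the large share of non-event days the way plain accuracy would be. The same `roc_auc` function could score any other classifier on the same held-out data for a like-for-like comparison.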
Pages: 4337-4357
Page count: 20