Comparison of data-driven methods for linking extreme precipitation events to local and large-scale meteorological variables

Cited by: 0
Authors
Nafsika Antoniadou
Hjalte Jomo Danielsen Sørup
Jonas Wied Pedersen
Ida Bülow Gregersen
Torben Schmith
Karsten Arnbjerg-Nielsen
Affiliations
[1] Technical University of Denmark, Department of Environmental and Resource Engineering, Climate and Monitoring
[2] Danish Meteorological Institute, National Centre for Climate Research
[3] Rambøll Denmark A/S, Department of Climate Adaptation and Green Infrastructure
Keywords
Extreme precipitation; Meteorological drivers; Machine learning; Logistic regression; ROC curve
DOI
Not available
Abstract
Extreme precipitation events can have severe consequences for society, the economy, and the environment, so it is crucial to understand when they occur. The literature offers many methods for analyzing their connection to meteorological drivers, and interest has recently grown in using machine learning methods instead of classic statistical models. While a few studies in climate research have compared the performance of these two approaches, their conclusions are inconsistent. To determine whether an extreme event occurred locally, we trained models using logistic regression and three commonly used supervised machine learning algorithms suited to discrete outcomes: random forests, neural networks, and support vector machines. We used five explanatory variables (geopotential height at 500 hPa, convective available potential energy, total column water, sea surface temperature, and air surface temperature) from ERA5, together with local data from the Danish Meteorological Institute. During variable selection, we found that convective available potential energy has the strongest relationship with extreme events. Logistic regression performed similarly to the more complex machine learning algorithms in terms of discrimination, as measured by the area under the receiver operating characteristic curve (ROC AUC) and other performance metrics suited to unbalanced datasets. Specifically, the ROC AUC for logistic regression was 0.86, while the best-performing machine learning algorithm achieved a ROC AUC of 0.87. This study emphasizes the value of comparing machine learning and classical regression modeling, especially when employing a limited set of well-established explanatory variables.
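As an illustration of the discrimination metric named in the abstract, the following minimal sketch computes ROC AUC as the probability that a randomly chosen extreme-event day receives a higher model score than a randomly chosen non-event day. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study; the study's own models and data are not reproduced here.

```python
def roc_auc(labels, scores):
    """Return ROC AUC via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 values (1 = extreme event, 0 = no event).
    scores: iterable of model scores, higher meaning "more likely an event".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical unbalanced sample: 2 event days out of 10.
toy_labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
toy_scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.02, 0.40, 0.35, 0.08, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

Because the metric depends only on how positives rank relative to negatives, it is not inflated by the large share of non-event days, which is one reason it is often reported for unbalanced datasets. The pairwise loop is quadratic in sample size, so a rank-based implementation would be preferable for long records.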
Pages: 4337-4357
Page count: 20
Related papers
50 in total
  • [1] Comparison of data-driven methods for linking extreme precipitation events to local and large-scale meteorological variables
    Antoniadou, Nafsika
    Sorup, Hjalte Jomo Danielsen
    Pedersen, Jonas Wied
    Gregersen, Ida Buelow
    Schmith, Torben
    Arnbjerg-Nielsen, Karsten
    STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2023, 37 (11) : 4337 - 4357
  • [2] A Dynamical and Statistical Characterization of US Extreme Precipitation Events and Their Associated Large-Scale Meteorological Patterns
    Zhao, Siyu
    Deng, Yi
    Black, Robert X.
    JOURNAL OF CLIMATE, 2017, 30 (04) : 1307 - 1326
  • [3] North American extreme precipitation events and related large-scale meteorological patterns: a review of statistical methods, dynamics, modeling, and trends
    Barlow, Mathew
    Gutowski, William J.
    Gyakum, John R.
    Katz, Richard W.
    Lim, Young-Kwon
    Schumacher, Russ S.
    Wehner, Michael F.
    Agel, Laurie
    Bosilovich, Michael
    Collow, Allison
    Gershunov, Alexander
    Grotjahn, Richard
    Leung, Ruby
    Milrad, Shawn
    Min, Seung-Ki
    CLIMATE DYNAMICS, 2019, 53 (11) : 6835 - 6875
  • [4] North American extreme precipitation events and related large-scale meteorological patterns: a review of statistical methods, dynamics, modeling, and trends
    Mathew Barlow
    William J. Gutowski
    John R. Gyakum
    Richard W. Katz
    Young-Kwon Lim
    Russ S. Schumacher
    Michael F. Wehner
    Laurie Agel
    Michael Bosilovich
    Allison Collow
    Alexander Gershunov
    Richard Grotjahn
    Ruby Leung
    Shawn Milrad
    Seung-Ki Min
    Climate Dynamics, 2019, 53 : 6835 - 6875
  • [5] A Data-driven Mechanism for Large-scale Data Distribution
    Shi Peichang
    Li Yiying
    Ding Bo
    Jiang Longquan
    Liu Hui
    Zhang Jie
    2016 WORLD AUTOMATION CONGRESS (WAC), 2016,
  • [6] Data-driven Authoring of Large-scale Ecosystems
    Kapp, Konrad
    Gain, James
    Guerin, Eric
    Galin, Eric
    Peytavie, Adrien
    ACM TRANSACTIONS ON GRAPHICS, 2020, 39 (06):
  • [7] Identification of large-scale meteorological patterns associated with extreme precipitation in the US northeast
    Agel, Laurie
    Barlow, Mathew
    Feldstein, Steven B.
    Gutowski, William J., Jr.
    CLIMATE DYNAMICS, 2018, 50 (5-6) : 1819 - 1839
  • [8] Identification of large-scale meteorological patterns associated with extreme precipitation in the US northeast
    Laurie Agel
    Mathew Barlow
    Steven B. Feldstein
    William J. Gutowski
    Climate Dynamics, 2018, 50 : 1819 - 1839
  • [9] Large-scale Data-driven Segmentation of Banking Customers
    Hossain, Md Monir
    Sebestyen, Mark
    Mayank, Dhruv
    Ardakanian, Omid
    Khazaei, Hamzeh
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 4392 - 4401
  • [10] Data-driven realistic animation of large-scale forest
    School of Computer Science, Wuhan University, Wuhan 430079, China
    Unknown
    Unknown
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2008, 20 (08): : 1015 - 1022