A BIG DATA GUIDE TO UNDERSTANDING CLIMATE CHANGE: The Case for Theory-Guided Data Science

Citations: 134
Authors
Faghmous, James H. [1]
Kumar, Vipin [1]
Affiliations
[1] Univ Minnesota Twin Cities, Dept Comp Sci & Engn, 200 Union St SE, Minneapolis, MN 55455 USA
Funding
National Science Foundation (United States)
Keywords
REGRESSION-MODELS; ATLANTIC; DROUGHT
DOI
10.1089/big.2014.0026
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Global climate change and its impact on human life has become one of our era's greatest challenges. Despite the urgency, data science has had little impact on furthering our understanding of our planet in spite of the abundance of climate data. This is a stark contrast from other fields such as advertising or electronic commerce where big data has been a great success story. This discrepancy stems from the complex nature of climate data as well as the scientific questions climate science brings forth. This article introduces a data science audience to the challenges and opportunities to mine large climate datasets, with an emphasis on the nuanced difference between mining climate data and traditional big data approaches. We focus on data, methods, and application challenges that must be addressed in order for big data to fulfill their promise with regard to climate science applications. More importantly, we highlight research showing that solely relying on traditional big data techniques results in dubious findings, and we instead propose a theory-guided data science paradigm that uses scientific theory to constrain both the big data techniques as well as the results-interpretation process to extract accurate insight from large climate data.
Pages: 155-163
Page count: 9