The necessary optimization of the data lifecycle: Marine geosciences in the big data era

Cited by: 1
Authors
Lee, Taylor R. R. [1]
Phrampus, Benjamin J. J. [1]
Obelcz, Jeffrey [1]
Affiliations
[1] US Naval Res Lab, Ocean Sci Div, Stennis Space Ctr, MS 39529 USA
Keywords
big data; data lifecycle; database; data acquisition; data curation; data-driven; data integration; SEA-FLOOR; SEDIMENTS
DOI
10.3389/feart.2022.1089112
Chinese Library Classification (CLC) number
P [Astronomy, Earth Sciences]
Discipline classification code
07
Abstract
In the marine geosciences, observations are typically acquired using research vessels to understand a given phenomenon or area of interest. Despite the plateauing of ship time and active research vessels in the last decade, the rate of marine geoscience data production has continued to increase. Simultaneously, there exist large quantities of legacy data aggregated within data repositories; however, these data are rarely curated to be both discoverable and machine-readable (i.e., accessible). This results in inefficient use, or even omission, of high-quality data that are both increasingly important to utilize and impractical to recollect. The proliferation of newly acquired data, and the increasing importance of legacy data, have only been met with incremental evolution in the methods of data integration. This paper describes some improvements at each stage of the data lifecycle (acquisition, curation, and integration) that could align the marine geosciences better with the "big data" paradigm. We have encountered several major issues in coordinating these efforts, which we outline here: 1) geologic anomalies are the primary focus of data acquisition, which makes it difficult to understand the dominant (i.e., baseline) marine geology, 2) marine geoscience data are rarely curated to be accessible, and 3) the aforementioned issues preclude the use of efficient integration tools that can make optimal use of data. In this paper, we discuss the challenges associated with these issues and solutions for overcoming them in future decades of marine geoscience. The successful execution of these interconnected steps will optimize the lifecycle of marine geoscience data in the "big data" era.
Pages: 8