Predicting carbon and water vapor fluxes using machine learning and novel feature ranking algorithms

Cited by: 18
Authors
Cui, Xia [1]
Goff, Thomas [2]
Cui, Song [3]
Menefee, Dorothy [4]
Wu, Qiang [5]
Rajan, Nithya [4]
Nair, Shyam [6]
Phillips, Nate [3]
Walker, Forbes [7]
Affiliations
[1] Lanzhou Univ, Minist Educ, Coll Earth & Environm Sci, Key Lab Western Chinas Environm Syst, Lanzhou 730000, Peoples R China
[2] Middle Tennessee State Univ, Ctr Computat Sci, Murfreesboro, TN 37132 USA
[3] Middle Tennessee State Univ, Sch Agr, Murfreesboro, TN 37132 USA
[4] Texas A&M Univ, Dept Soil & Crop Sci, College Stn, TX 77843 USA
[5] Middle Tennessee State Univ, Dept Math Sci, Murfreesboro, TN 37132 USA
[6] Sam Houston State Univ, Dept Agr Sci & Engn Technol, Huntsville, TX 77341 USA
[7] Univ Tennessee, Dept Biosyst Engn & Soil Sci, Knoxville, TN 37996 USA
Funding
National Natural Science Foundation of China;
Keywords
Eddy covariance; Machine learning; Feature ranking; Support vector machine; Remote sensing; EDDY-COVARIANCE MEASUREMENTS; NET ECOSYSTEM EXCHANGE; SLICED INVERSE REGRESSION; ENERGY-BALANCE CLOSURE; DATA-DRIVEN TECHNIQUES; TERRESTRIAL EVAPOTRANSPIRATION; SPATIOTEMPORAL PATTERN; DIOXIDE EXCHANGE; FLUXNET SITES; FOREST;
DOI
10.1016/j.scitotenv.2021.145130
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08; 0830;
Abstract
Gap-filling eddy covariance flux data using quantitative approaches has increased over the past decade. Numerous methods have been proposed previously, including look-up table approaches, parametric methods, process-based models, and machine learning. Particularly, the REddyProc package from the Max Planck Institute for Biogeochemistry and the ONEFlux package from AmeriFlux have been widely used in many studies. However, there is no consensus regarding the optimal model and feature selection method that could be used for predicting different flux targets (Net Ecosystem Exchange, NEE; or Evapotranspiration, ET), due to the limited systematic comparative research based on identical site data. Here, we compared NEE and ET gap-filling/prediction performance of the least-square-based linear model, artificial neural network, random forest (RF), and support vector machine (SVM) using data obtained from four major row-crop and forage agroecosystems located in the subtropical or the climate-transition zones in the US. Additionally, we tested the impacts of different training-testing data partitioning settings, including a 10-fold time-series sequential (10FTS), a 10-fold cross validation (CV) routine with single data point (10FCV), daily (10FCVD), weekly (10FCVW) and monthly (10FCVM) gap length, and a 7/14-day flanking window (FW) approach; and implemented a novel Sliced Inverse Regression-based Recursive Feature Elimination algorithm (SIRRFE). We benchmarked the model performance against REddyProc and ONEFlux-produced results. Our results indicated that accurate NEE and ET prediction models could be systematically constructed using SVM/RF and only a few top informative features. The gap-filling performance of ONEFlux is generally satisfactory (R² = 0.39-0.71), but results from REddyProc could be very limited or even unreliable in many cases (R² = 0.01-0.67). Overall, SIRRFE-refined SVM models yielded excellent results for predicting NEE (R² = 0.46-0.92) and ET (R² = 0.74-0.91). Finally, the performance of various models was greatly affected by the types of ecosystem, predicting targets, and training algorithms, but was insensitive towards training-testing partitioning. Our research provided more insights into constructing novel gap-filling models and understanding the underlying drivers affecting boundary layer carbon/water fluxes on an ecosystem level. © 2021 Elsevier B.V. All rights reserved.
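The training-testing partitioning settings named in the abstract can be illustrated with a minimal sketch. This is one possible reading of those settings, not the authors' implementation: the abstract does not give the record frequency or the exact fold construction, so the half-hourly assumption (`RECORDS_PER_DAY = 48`) and the function names `blocked_kfold`, `sequential_kfold`, and `flanking_window` are hypothetical and chosen only for illustration.

```python
import random

# Assumption: half-hourly flux records, i.e. 48 per day (not stated in the abstract).
RECORDS_PER_DAY = 48


def blocked_kfold(n_samples, block_len, n_folds=10, seed=0):
    """Yield (train, test) index lists whose test sets are whole contiguous blocks.

    block_len=1 mimics single-point gaps (10FCV); one day, one week, or one
    month of records mimics daily, weekly, or monthly artificial gaps.
    """
    n_blocks = -(-n_samples // block_len)  # ceiling division
    if n_blocks < n_folds:
        raise ValueError("need at least one block per fold")
    order = list(range(n_blocks))
    random.Random(seed).shuffle(order)  # blocks are assigned to folds at random
    fold_of_block = [0] * n_blocks
    for rank, block in enumerate(order):
        fold_of_block[block] = rank % n_folds
    for fold in range(n_folds):
        train, test = [], []
        for i in range(n_samples):
            (test if fold_of_block[i // block_len] == fold else train).append(i)
        yield train, test


def sequential_kfold(n_samples, n_folds=10):
    """Yield folds whose test sets are consecutive tenths of the series (10FTS-like)."""
    bounds = [k * n_samples // n_folds for k in range(n_folds + 1)]
    for k in range(n_folds):
        test = list(range(bounds[k], bounds[k + 1]))
        train = list(range(bounds[k])) + list(range(bounds[k + 1], n_samples))
        yield train, test


def flanking_window(gap_start, gap_stop, n_samples, window_days):
    """Return training indices within window_days on each side of the gap [gap_start, gap_stop)."""
    width = window_days * RECORDS_PER_DAY
    lo = max(0, gap_start - width)
    hi = min(n_samples, gap_stop + width)
    return list(range(lo, gap_start)) + list(range(gap_stop, hi))


if __name__ == "__main__":
    n = 365 * RECORDS_PER_DAY  # one synthetic year of records
    settings = [("single point", 1), ("daily", 48), ("weekly", 7 * 48), ("monthly", 30 * 48)]
    for name, block_len in settings:
        train, test = next(blocked_kfold(n, block_len))
        print(name, "gap:", len(train), "training /", len(test), "testing records")
```

Each yielded pair can be passed to any regressor's fit and predict steps; keeping the test indices in contiguous blocks is what makes the evaluation resemble filling a gap of that length rather than interpolating between neighbouring records.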
Pages: 15