Unmasking the sky: high-resolution PM2.5 prediction in Texas using machine learning techniques

Cited by: 1
Authors
Zhang, Kai [1]
Lin, Jeffrey [2]
Li, Yuanfei [3]
Sun, Yue [4]
Tong, Weitian [5]
Li, Fangyu [6]
Chien, Lung-Chang [7]
Yang, Yiping [2]
Su, Wei-Chung [6]
Tian, Hezhong [8, 9]
Fu, Peng [10, 11]
Qiao, Fengxiang [12]
Romeiko, Xiaobo Xue [1]
Lin, Shao [1]
Luo, Sheng [13]
Craft, Elena [14]
Affiliations
[1] SUNY Albany, Sch Publ Hlth, Dept Environm Hlth Sci, Rensselaer, NY 12144 USA
[2] Univ Texas Hlth Sci Ctr Houston, Sch Publ Hlth, Dept Biostat & Data Sci, Houston, TX USA
[3] Shanghai Univ, Asian Demog Res Inst, Shanghai, Peoples R China
[4] Clark Univ, Dept Int Dev Community & Environm, Worcester, MA USA
[5] Georgia Southern Univ, Dept Comp Sci, Statesboro, GA USA
[6] Univ Texas Hlth Sci Ctr, Dept Epidemiol Human Genet & Environm Sci, Sch Publ Hlth, Houston, TX USA
[7] Univ Nevada, Sch Publ Hlth, Dept Epidemiol & Biostat, Las Vegas, NV USA
[8] Beijing Normal Univ, Sch Environm, State Key Joint Lab Environm Simulat & Pollut Cont, Beijing, Peoples R China
[9] Beijing Normal Univ, Ctr Atmospher Environm Studies, Beijing, Peoples R China
[10] Univ Illinois, Dept Plant Biol, Urbana, IL USA
[11] Harrisburg Univ, Ctr Econ Environm & Energy, Harrisburg, PA USA
[12] Texas Southern Univ, Innovat Transportat Res Inst, Houston, TX USA
[13] Duke Univ, Dept Biostat & Bioinformat, Durham, NC USA
[14] Hlth Effects Inst, Boston, MA USA
Keywords
AOD; Gradient boosting; Machine learning; PM2.5; Random forest; Fine particulate matter; Privately insured population; Beijing-Tianjin-Hebei; Components; Model
DOI
10.1038/s41370-024-00659-w
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Background: Although PM2.5 (fine particulate matter with an aerodynamic diameter of less than 2.5 μm) is an air pollutant of great concern in Texas, the limited number of regulatory monitors poses a significant challenge for decision-making and environmental studies.
Objective: This study aimed to predict daily PM2.5 concentrations at a fine spatial scale using novel machine learning approaches that incorporate satellite-derived Aerosol Optical Depth (AOD) and a variety of weather and land-use variables.
Methods: We compiled a comprehensive dataset for Texas from 2013 to 2017, including ground-level PM2.5 concentrations from regulatory monitors; AOD values at 1-km resolution based on images retrieved from the MODIS satellite; and weather, land-use, and population density data, among others. We built predictive models for each year separately to estimate PM2.5 concentrations using two machine learning approaches: gradient boosted trees and random forest. We evaluated model prediction performance using in-sample and out-of-sample validation.
Results: Our predictive models demonstrate excellent in-sample performance, as indicated by high R² values from the gradient boosting models (0.94-0.97) and random forest models (0.81-0.90). However, the out-of-sample R² values range from 0.52 to 0.75 for gradient boosting models and from 0.44 to 0.69 for random forest models. Model performance varies slightly across years. A generally decreasing trend in predicted PM2.5 concentrations over time is observed in Eastern Texas.
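The gap between in-sample and out-of-sample R² reported above can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a deliberately simplified gradient boosting regressor built from one-split trees (stumps) with squared-error loss, and it runs on synthetic data. The feature names `aod` and `temperature`, the coefficients in the data generator, and all hyperparameters are hypothetical choices made only for illustration.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst


def fit_stump(X, residuals):
    """Return the (feature, threshold, left_mean, right_mean) split with lowest squared error."""
    best = None
    for j in range(len(X[0])):
        # Drop the largest value so the right-hand side is never empty.
        for thr in sorted({row[j] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]


def fit_boosting(X, y, n_rounds=30, learning_rate=0.3):
    """Gradient boosting for squared loss: each stump is fit to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        j, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((j, thr, lm, rm))
        pred = [p + learning_rate * (lm if row[j] <= thr else rm)
                for row, p in zip(X, pred)]
    return base, stumps, learning_rate


def predict(model, X):
    base, stumps, lr = model
    return [base + lr * sum(lm if row[j] <= thr else rm for j, thr, lm, rm in stumps)
            for row in X]


# Synthetic stand-in data (hypothetical): PM2.5 rises with AOD and temperature, plus noise.
rng = random.Random(0)
X, y = [], []
for _ in range(120):
    aod, temperature = rng.uniform(0.0, 1.0), rng.uniform(0.0, 35.0)
    X.append([aod, temperature])
    y.append(5.0 + 20.0 * aod + 0.2 * temperature + rng.gauss(0.0, 3.0))

# Hold out the last 40 rows for out-of-sample validation.
X_train, y_train, X_test, y_test = X[:80], y[:80], X[80:], y[80:]

model = fit_boosting(X_train, y_train)
r2_in = r_squared(y_train, predict(model, X_train))
r2_out = r_squared(y_test, predict(model, X_test))
print(f"in-sample R2: {r2_in:.2f}, out-of-sample R2: {r2_out:.2f}")
```

In-sample R² is computed on the same rows used for fitting, so it tends to overstate how well the model generalizes; the held-out R² is the more informative figure. A random row split, as used here, can still be optimistic for spatial data, and the abstract does not state how its out-of-sample sets were constructed.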
Pages: 814-820
Page count: 7