Unmasking the sky: high-resolution PM2.5 prediction in Texas using machine learning techniques

Authors
Zhang, Kai [1]
Lin, Jeffrey [2]
Li, Yuanfei [3]
Sun, Yue [4]
Tong, Weitian [5]
Li, Fangyu [6]
Chien, Lung-Chang [7]
Yang, Yiping [2]
Su, Wei-Chung [6]
Tian, Hezhong [8,9]
Fu, Peng [10,11]
Qiao, Fengxiang [12]
Romeiko, Xiaobo Xue [1]
Lin, Shao [1]
Luo, Sheng [13]
Craft, Elena [14]
Affiliations
[1] SUNY Albany, Sch Publ Hlth, Dept Environm Hlth Sci, Rensselaer, NY 12144 USA
[2] Univ Texas Hlth Sci Ctr Houston, Sch Publ Hlth, Dept Biostat & Data Sci, Houston, TX USA
[3] Shanghai Univ, Asian Demog Res Inst, Shanghai, Peoples R China
[4] Clark Univ, Dept Int Dev Community & Environm, Worcester, MA USA
[5] Georgia Southern Univ, Dept Comp Sci, Statesboro, GA USA
[6] Univ Texas Hlth Sci Ctr, Dept Epidemiol Human Genet & Environm Sci, Sch Publ Hlth, Houston, TX USA
[7] Univ Nevada, Sch Publ Hlth, Dept Epidemiol & Biostat, Las Vegas, NV USA
[8] Beijing Normal Univ, Sch Environm, State Key Joint Lab Environm Simulat & Pollut Cont, Beijing, Peoples R China
[9] Beijing Normal Univ, Ctr Atmospher Environm Studies, Beijing, Peoples R China
[10] Univ Illinois, Dept Plant Biol, Urbana, IL USA
[11] Harrisburg Univ, Ctr Econ Environm & Energy, Harrisburg, PA USA
[12] Texas Southern Univ, Innovat Transportat Res Inst, Houston, TX USA
[13] Duke Univ, Dept Biostat & Bioinformat, Durham, NC USA
[14] Hlth Effects Inst, Boston, MA USA
Keywords
AOD; Gradient boosting; Machine learning; PM2.5; Random forest; FINE PARTICULATE MATTER; PRIVATELY INSURED POPULATION; BEIJING-TIANJIN-HEBEI; COMPONENTS; MODEL
DOI
10.1038/s41370-024-00659-w
Chinese Library Classification
X [Environmental Science, Safety Science]
Subject classification code
08; 0830
Abstract
Background: Although PM2.5 (fine particulate matter with an aerodynamic diameter less than 2.5 μm) is an air pollutant of great concern in Texas, the limited number of regulatory monitors poses a significant challenge for decision-making and environmental studies.
Objective: This study aimed to predict daily PM2.5 concentrations at a fine spatial scale by using novel machine learning approaches and incorporating satellite-derived Aerosol Optical Depth (AOD) and a variety of weather and land-use variables.
Methods: We compiled a comprehensive dataset for Texas from 2013 to 2017, including ground-level PM2.5 concentrations from regulatory monitors; AOD values at 1-km resolution based on images retrieved from the MODIS satellite; and weather, land-use, and population density variables, among others. We built predictive models for each year separately to estimate PM2.5 concentrations using two machine learning approaches: gradient boosted trees and random forest. We evaluated model prediction performance using in-sample and out-of-sample validation.
Results: Our predictive models demonstrate excellent in-sample performance, as indicated by high R² values from the gradient boosting models (0.94-0.97) and the random forest models (0.81-0.90). However, the out-of-sample R² values fall within a range of 0.52-0.75 for the gradient boosting models and 0.44-0.69 for the random forest models. Model performance varies slightly across years. A generally decreasing trend in predicted PM2.5 concentrations over time is observed in Eastern Texas.
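The gap between in-sample and out-of-sample R² reported in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' pipeline: the data are synthetic, the two predictors (`aod`, `temp`) and their coefficients are invented for illustration, and the model is a deliberately minimal gradient boosting regressor built from depth-1 trees (stumps) in pure Python rather than a production library. Random forest is not shown. All function names are hypothetical.

```python
import random

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_stump(X, residuals):
    """Least-squares depth-1 tree: (feature, threshold, left_value, right_value)."""
    best = None
    for j in range(len(X[0])):
        # Skip the largest value so the right-hand side is never empty.
        for thr in sorted({row[j] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]

def predict_stump(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value

def fit_boosted(X, y, rounds=30, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + lr * predict_stump(stump, row) for p, row in zip(pred, X)]
    return base, lr, stumps

def predict(model, X):
    base, lr, stumps = model
    return [base + lr * sum(predict_stump(s, row) for s in stumps) for row in X]

# Synthetic stand-in data (assumption): PM2.5 driven by AOD and temperature plus noise.
rng = random.Random(0)

def make_rows(n):
    X, y = [], []
    for _ in range(n):
        aod, temp = rng.uniform(0, 1), rng.uniform(0, 30)
        X.append([aod, temp])
        y.append(5 + 20 * aod + 0.2 * temp + rng.gauss(0, 2))
    return X, y

X_train, y_train = make_rows(100)   # rows used to fit the model (in-sample)
X_test, y_test = make_rows(50)      # held-out rows (out-of-sample)

model = fit_boosted(X_train, y_train)
r2_in = r2(y_train, predict(model, X_train))
r2_out = r2(y_test, predict(model, X_test))
print(f"in-sample R2 = {r2_in:.2f}, out-of-sample R2 = {r2_out:.2f}")
```

In-sample R² is computed on the rows the model was fitted to, so it rewards memorised noise; out-of-sample R² on held-out rows is the more realistic estimate of how well predictions transfer to locations without monitors.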
Pages: 814-820
Page count: 7