Evaluation of Machine Learning Models for Estimating PM2.5 Concentrations across Malaysia

Cited by: 26
Authors
Zaman, Nurul Amalin Fatihah Kamarul [1]
Kanniah, Kasturi Devi [1,2]
Kaskaoutis, Dimitris G. [3,4]
Latif, Mohd Talib [5]
Affiliations
[1] Univ Teknol Malaysia, Fac Built Environm & Surveying, Trop Map Res Grp, Skudai 81310, Johor, Malaysia
[2] Univ Teknol Malaysia, Ctr Environm Sustainabil & Water Secur IPASA, Res Inst Sustainable Environm, Utm 81310, Johor, Malaysia
[3] Natl Observ Athens, Inst Environm Res & Sustainable Dev, Athens 15236, Greece
[4] Univ Crete, Dept Chem, Environm Chem Proc Lab, Iraklion 71003, Greece
[5] Univ Kebangsaan Malaysia, Fac Sci & Technol, Dept Earth Sci & Environm, Bangi 43600, Selangor, Malaysia
Source
APPLIED SCIENCES-BASEL | 2021 / Volume 11 / Issue 16
Keywords
PM2.5; Himawari-8; random forest; support vector regression; air pollution; Malaysia; AEROSOL OPTICAL DEPTH; GROUND-LEVEL PM2.5; PARTICULATE MATTER; METEOROLOGICAL VARIABLES; NEXT-GENERATION; AIR-QUALITY; POLLUTANT CONCENTRATIONS; OZONE CONCENTRATIONS; PM10; CONCENTRATION; NEURAL-NETWORK
DOI
10.3390/app11167326
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Southeast Asia (SEA) is a hotspot region for atmospheric pollution and haze conditions, due to extensive forest, agricultural and peat fires. This study aims to estimate the PM2.5 concentrations across Malaysia using machine-learning (ML) models like Random Forest (RF) and Support Vector Regression (SVR), based on satellite AOD (aerosol optical depth) observations, ground measured air pollutants (NO2, SO2, CO, O3) and meteorological parameters (air temperature, relative humidity, wind speed and direction). The estimated PM2.5 concentrations for a two-year period (2018-2019) are evaluated against measurements performed at 65 air-quality monitoring stations located at urban, industrial, suburban and rural sites. PM2.5 concentrations varied widely between the stations, with higher values (mean of 24.2 ± 21.6 µg m⁻³) at urban/industrial stations and lower (mean of 21.3 ± 18.4 µg m⁻³) at suburban/rural sites. Furthermore, pronounced seasonal variability in PM2.5 is recorded across Malaysia, with highest concentrations during the dry season (June-September). Seven models were developed for PM2.5 predictions, i.e., separately for urban/industrial and suburban/rural sites, for the four dominant seasons (dry, wet and two inter-monsoon), and an overall model, which displayed accuracies in the order of R² = 0.46-0.76. The validation analysis reveals that the RF model (R² = 0.53-0.76) exhibits slightly better performance than SVR, except for the overall model. This is the first study conducted in Malaysia for PM2.5 estimations at a national scale combining satellite aerosol retrievals with ground-based pollutants, meteorological factors and ML techniques. The satisfactory prediction of PM2.5 concentrations across Malaysia allows a continuous monitoring of the pollution levels at remote areas with absence of measurement networks.
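The accuracies quoted in the abstract are reported as R² values between estimated and measured PM2.5. The abstract does not state which definition of R² the authors used, so the following is only a minimal sketch assuming the common coefficient-of-determination form, 1 - SS_res / SS_tot (some studies instead report the squared Pearson correlation). The function name `r_squared` and the sample values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = fmean(observed)
    # Residual sum of squares: squared error of the model estimates.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical PM2.5 values in µg m^-3, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
score = r_squared(observed, predicted)
```

Under this definition a model that reproduces every measurement exactly scores 1, and a model no better than predicting the station mean scores 0, which gives a rough sense of what the reported range of 0.46-0.76 means.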
Pages: 24