GCN-ST-MDIR: Graph Convolutional Network-Based Spatial-Temporal Missing Air Pollution Data Pattern Identification and Recovery

Cited by: 2
Authors
Yu, Yangwen [1]
Li, Victor O. K. [1]
Lam, Jacqueline C. K. [1]
Chan, Kelvin [1]
Affiliations
[1] Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong, Peoples R China
Keywords
Air pollution; Data models; Atmospheric modeling; Monitoring; Training; Convolutional neural networks; Big Data; Air pollution data; graph convolutional network; transfer learning; automatic; missing data pattern identification; missing data pattern recovery; similarity matrix; spatial-temporal; PARTICULATE MATTER; NEURAL-NETWORK; DISEASE; QUALITY; CITIES; PM2.5; MODEL; PM10; LSTM; CNN;
DOI
10.1109/TBDATA.2023.3277710
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Missing data pattern identification and recovery (MDIR) is vital for accurate air pollution monitoring. To recover missing air pollution data, GCN-ST-MDIR, a Graph Convolutional Network (GCN)-based MDIR framework, is proposed to identify daily missing data patterns and automatically select the best recovery method. GCN-ST-MDIR presents four novelties: (1) A new graph construction is developed to improve GCN data representation for MDIR using a spatial-temporal (S-T) similarity matrix and domain-specific knowledge (e.g., weekend/weekday). (2) A transfer learning (TL) component is used to pre-train the LSCE and ILSCE models. (3) A GCN structure outputs a selection indicator to determine the dominant missing pattern for each daily input. The accuracy of the pre-trained data recovery models is incorporated into the GCN loss function to penalize a wrong indicator. (4) The output of the GCN structure is used as a score to combine LSCE and ILSCE. Results show that domain-specific S-T regularity and irregularity can be used as prior information for both the GCN and ILSCE/LSCE to enhance feature extraction. The model considerably improves recovery performance compared to the baselines. GCN-ST-MDIR achieves an accuracy of 88.48% for general missing data recovery with consecutively and sporadically missing data. GCN-ST-MDIR can be extended to many other S-T MDIR challenges.
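As a rough illustration of novelty (4), the sketch below blends two candidate recoveries using a score in [0, 1]. The abstract does not state the combination rule, so the convex combination, the function name `combine_recoveries`, the reading of the score as the weight on the LSCE estimate, and the sample values are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def combine_recoveries(
    observed: Sequence[Optional[float]],
    lsce_estimate: Sequence[float],
    ilsce_estimate: Sequence[float],
    score: float,
) -> List[float]:
    """Fill gaps (None) with a score-weighted blend of two candidate recoveries.

    Assumption: `score` in [0, 1] is the weight given to the LSCE estimate;
    the rule actually used in the paper is not given in the abstract.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if not (len(observed) == len(lsce_estimate) == len(ilsce_estimate)):
        raise ValueError("all series must have the same length")

    recovered = []
    for obs, lsce_value, ilsce_value in zip(observed, lsce_estimate, ilsce_estimate):
        if obs is None:
            # Missing reading: blend the two model estimates.
            recovered.append(score * lsce_value + (1.0 - score) * ilsce_value)
        else:
            # Observed reading: keep it unchanged.
            recovered.append(obs)
    return recovered


# Hypothetical pollutant readings with two missing values.
observed = [12.0, None, 15.0, None]
lsce = [11.0, 14.0, 16.0, 18.0]
ilsce = [13.0, 10.0, 14.0, 22.0]
result = combine_recoveries(observed, lsce, ilsce, score=0.75)
```

A score near 1 leans on the first model and a score near 0 on the second, so a hard selection indicator is the special case where the score is exactly 0 or 1.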
Pages: 1347-1364
Page count: 18