Because air pollution caused by PM2.5 (particulate matter with an aerodynamic diameter of ≤ 2.5 μm) is a serious threat to human health, accurate forecasting of PM2.5 concentration in metropolitan areas is a prerequisite for reducing and eliminating its harmful effects on people. In this study, we analyzed the spatiotemporal correlations between the target and observation parameters relevant to air pollution forecasting and proposed a combined convolutional neural network (CNN) and long short-term memory (LSTM) model, called the PM predictor, for forecasting the next day's daily average PM2.5 concentration in Beijing. The spatiotemporal correlations were analyzed through an efficient estimation of mutual information, which applies both when the degrees of variation in the two spaces under consideration are similar and when they differ significantly, and the analysis was used to generate a spatiotemporal feature vector. The CNN efficiently extracted latent features from the air quality and meteorological input data relevant to PM2.5, and the LSTM captured the historical information in the time series data. The resulting PM predictor performed markedly better than the multi-layer perceptron (MLP) and LSTM models in overall forecasting. Air quality and meteorological data from monitoring stations in Beijing and four surrounding cities, covering January 1, 2015 to December 31, 2017, were used as the dataset. The forecasting results suggest that the proposed PM predictor is superior to the other models in overall forecasting, whereas the LSTM model is slightly better than the PM predictor in seasonal forecasting.
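
To illustrate only the mutual-information step mentioned above, the following minimal sketch computes a histogram-based (plug-in) estimate of the mutual information between two series. The estimator, the function name `mutual_information`, the `bins` parameter, and the toy readings are all assumptions made for illustration. The abstract does not state which estimator the study used, so this should not be read as the authors' method.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=4):
    """Histogram (plug-in) estimate of mutual information, in nats.

    Hypothetical helper for illustration; not the estimator from the study.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")

    def discretize(values):
        # Map each value to one of `bins` equal-width bins over its own range.
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant series: use a dummy width
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(x)
    joint = Counter(zip(bx, by))   # joint bin counts
    px, py = Counter(bx), Counter(by)  # marginal bin counts

    mi = 0.0
    for (i, j), c in joint.items():
        # p_ij * log(p_ij / (p_i * p_j)), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi


# Toy, made-up readings (not data from the study).
target_pm25 = [35.0, 80.0, 120.0, 60.0, 150.0, 40.0, 95.0, 110.0]
neighbor_pm25 = [30.0, 75.0, 130.0, 55.0, 160.0, 45.0, 90.0, 100.0]
print(mutual_information(target_pm25, neighbor_pm25))
```

Under these assumptions, one could compute such a score between the target station's PM2.5 series and each candidate observation series (possibly time-lagged) and use the scores to decide which inputs enter the spatiotemporal feature vector.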