Air pollution is a major obstacle to China's socioeconomic development and significantly affects public health and living conditions. Accurate air quality forecasting therefore carries substantial societal value and has become increasingly important, particularly within Chinese urban agglomerations. Recent deep learning approaches have markedly improved the modeling of complex spatial-temporal correlations in air quality data, primarily by treating the data as a dispersion process. However, much of the previous research has neglected the effect that heterogeneity in air quality data within an urban agglomeration has on predictions. Moreover, these studies have not adequately modeled and disentangled the air quality components attributable to an individual city from those originating in other cities of the same agglomeration. To improve forecasting performance, we propose a novel model built on the Gaussian Decoupled Representation Extractor and designed to forecast air quality within urban agglomerations. Using a data-driven approach, the model decomposes the air quality time series by quantifying and decoding the heterogeneous components associated with the time, location, and origin variables embedded in the data. It incorporates a dual attention mechanism to capture the spatial-temporal dependencies in air quality. We validate the model on a data set from an urban agglomeration in Hunan Province. Extensive experiments show that, compared with the baselines, our model reduces MAE by 11.2% and RMSE by 8.97%.
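
For readers unfamiliar with the reported metrics, the following minimal sketch shows how MAE, RMSE, and a relative error reduction are commonly computed. It assumes the reported percentages are relative reductions against a baseline error, which the abstract does not state explicitly. The function names and all numeric values are hypothetical illustrations, not results or code from this work.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between observed and forecast values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between observed and forecast values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error.

    Assumption: this is the convention behind the reported reductions.
    """
    return 100.0 * (baseline_error - model_error) / baseline_error


# Made-up pollutant concentrations, for illustration only.
observed = [50.0, 60.0, 70.0, 80.0]
baseline_forecast = [54.0, 56.0, 74.0, 76.0]
model_forecast = [52.0, 58.0, 72.0, 78.0]

baseline_mae = mae(observed, baseline_forecast)
model_mae = mae(observed, model_forecast)
print(relative_reduction(baseline_mae, model_mae))
```

Because RMSE squares each error before averaging, it penalizes large deviations more heavily than MAE, which is one reason the two metrics can improve by different percentages for the same model.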