The air quality index (AQI) is a widely used metric for evaluating air quality across locations and time periods. Like other environmental datasets, AQI data can contain outliers, that is, data points that diverge markedly from the norm and indicate exceptionally good or poor air quality. Detecting them is crucial for identifying and understanding severe pollution episodes with far-reaching environmental and public health implications. This study uses daily air quality data collected in Shanghai, China, from January 1, 2014, to January 31, 2021, as the experimental dataset. The dataset includes daily AQI measurements along with the concentrations of six pollutants: particulate matter (PM2.5 and PM10), sulfur dioxide (SO2), nitrogen dioxide (NO2), ozone (O3), and carbon monoxide (CO). Each pollutant's concentration is measured in micrograms per cubic meter (μg/m³). The dataset is preprocessed by cleaning and normalization, and K-means clustering is then applied to discover distinct patterns. A stacked ensemble machine learning model that incorporates K-means clustering, random forest (RF), and gradient boosting classifier (GBC) is developed and compared with decision tree, support vector machine, K-nearest neighbor, and Naive Bayes algorithms, using accuracy, precision, recall, and F1-score to evaluate its performance in identifying outliers. The stacked model outperformed all of the compared models, achieving accuracy, precision, recall, and F1-score of 0.99, 0.99, 0.97, and 0.99, respectively.

This study explores outlier detection in Shanghai's air quality index (AQI) data from January 2014 to January 2021 using a stacked ensemble machine learning model that combines K-means clustering, random forest, and gradient boosting classifier.
Evaluated with metrics such as accuracy and F1-score, the model surpasses traditional methods such as decision trees and SVMs, demonstrating its effectiveness in identifying significant pollution episodes with implications for environmental and public health.
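As a rough illustration of the pipeline summarized above, the following minimal sketch uses scikit-learn. It is not the authors' implementation. The synthetic data, the outlier labels, the number of clusters, the use of K-means centroid distances as extra features, and the logistic-regression meta-learner are all assumptions made for the example, since the abstract does not specify these details.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for daily records with six pollutant columns
# (assumption: the real Shanghai data is not reproduced here).
normal_days = rng.normal(loc=50.0, scale=10.0, size=(950, 6))
extreme_days = rng.normal(loc=150.0, scale=20.0, size=(50, 6))
X = np.vstack([normal_days, extreme_days])
y = np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)])  # 1 = outlier

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Normalize using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# K-means step (assumption: 3 clusters, with the distance to each
# centroid appended as additional features).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train_s)
X_train_f = np.column_stack([X_train_s, kmeans.transform(X_train_s)])
X_test_f = np.column_stack([X_test_s, kmeans.transform(X_test_s)])

# Stack RF and GBC; the meta-learner choice is an assumption.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gbc", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train_f, y_train)
y_pred = stack.predict(X_test_f)

# The four metrics named in the abstract.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print(metrics)
```

Because outliers are rare, accuracy alone can look high even when most outliers are missed, which is one reason to report precision, recall, and F1-score alongside it.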