Water quality is a critical aspect of environmental health, human well-being, and economic stability. High-quality water matters most in places where water is scarce. In Mexico, the official classification system does not rate water quality in terms of potability; it only indicates whether the water is contaminated by certain elements. In this work, we therefore test the hypothesis that water quality can be classified from existing measurements. To do this, we apply k-means clustering, followed by supervised techniques: Support Vector Classification (SVC), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). Starting from unlabeled data, this allows us to build a system that identifies which type of contaminant is present in the water with an accuracy above 95%. The results, however, show that the data collected by Mexican authorities are scarce and poorly maintained, making them an imperfect source of information. Nonetheless, important insights can be extracted, such as the evolution of contamination over time, which reveals that contamination-free water has become scarcer, to the point where there are no samples of high-quality water in recent years.