Tyre behaviour depends strongly and non-linearly on vertical force, road roughness, wear level, temperature gradient, and slip. This makes model calibration challenging, since the identified parameters may vary significantly with the tyre's condition. A further obstacle to identifying and modelling this multi-dimensional variability is the low accuracy of tyre-road interaction data, which often contain physical inconsistencies and outliers, an issue that particularly affects outdoor testing scenarios. Outliers, gaps, or errors in the data can compromise calibration performance, potentially leading to an incorrectly identified model that is unsuitable for further offline and online applications. In this paper, the authors aim to optimize the identification of tyre parameters by applying machine learning techniques to the pre-processing of the dataset, with particular attention to clustering and anomaly detection algorithms. The process is split into two phases: first, different clustering algorithms are applied to the tyre data to group similar operating conditions; then, anomaly detection algorithms are applied to the clustered data to recognize and remove inconsistencies. To compare the results of the proposed data processing objectively, the pre-processed experimental data, acquired specifically for this purpose, are used to calibrate the reference mathematical tyre formulation, and the deviations of the fundamental tyre-related quantities are evaluated against a previously identified tyre model already validated in both offline and online scenarios. For the evaluation of the grip coefficient versus both lateral and longitudinal slip, the Elliptic Envelope algorithm proves to be the best anomaly detection algorithm, while the One-Class Support Vector Machine technique yields lower deviations for the stiffness evaluation in both the longitudinal and lateral directions.
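The two-phase pre-processing described above can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the authors' implementation: the choice of KMeans for the clustering phase, the two synthetic features (a normalised vertical load and a grip coefficient), the contamination level, and the helper name `clean_by_cluster` are all assumptions made purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.covariance import EllipticEnvelope
from sklearn.svm import OneClassSVM


def clean_by_cluster(X, n_clusters=3, detector="elliptic", contamination=0.05, seed=0):
    """Hypothetical helper: cluster the samples, then flag anomalies per cluster.

    Returns (keep, labels): keep is True for samples retained as inliers.
    """
    # Phase 1: group samples with similar operating conditions.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)

    # Phase 2: fit one anomaly detector per cluster and keep its inliers.
    keep = np.zeros(len(X), dtype=bool)
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        if detector == "elliptic":
            model = EllipticEnvelope(contamination=contamination, random_state=seed)
        else:
            model = OneClassSVM(nu=contamination, gamma="scale")
        keep[idx] = model.fit_predict(X[idx]) == 1  # +1 inlier, -1 outlier
    return keep, labels


# Synthetic stand-in for tyre data (assumed, not measured):
# columns are normalised vertical load and grip coefficient.
rng = np.random.default_rng(0)
centres = np.array([[0.5, 0.9], [1.0, 1.0], [1.5, 1.1]])
clean = np.vstack([c + 0.05 * rng.standard_normal((200, 2)) for c in centres])

# Two gross outliers per operating condition, shifted in grip only.
outliers = np.repeat(centres, 2, axis=0) + np.array([0.0, 0.5])
X = np.vstack([clean, outliers])

keep, labels = clean_by_cluster(X, detector="elliptic")
keep_svm, _ = clean_by_cluster(X, detector="ocsvm")

print(f"Elliptic Envelope kept {keep.sum()} of {len(X)} samples")
print(f"One-Class SVM kept {keep_svm.sum()} of {len(X)} samples")
```

Fitting a separate detector inside each cluster, rather than one detector on the whole dataset, keeps a sample from being flagged merely because it belongs to a less common operating condition. The retained samples `X[keep]` would then be passed to the tyre model calibration step.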