This work compares methods for computing confidence bands in the validation of a vehicle single-track model. The confidence bands are computed from time series using a naive method, Gaussian process regression, and heteroscedastic and non-stationary Gaussian process regression. The simulation model accounts for the epistemic uncertainty of the vehicle mass parameter by Latin hypercube sampling. The validation procedure compares all stochastically simulated time series of the vehicle yaw rate with the confidence band of the reference data. The model is marked as valid if the simulated yaw rate lies within the confidence band of the reference data at every time step. The reference data were challenging because of noise and because both their variance and their smoothness vary over time. Because of the required data pre-processing and its high sensitivity to noise in the reference data, the naive method generated unusable confidence bands and cannot be recommended for similar validation tasks. Gaussian process regression solved the problem of noise sensitivity but was not able to model the time-varying length scale of the reference data. Therefore, heteroscedastic and non-stationary Gaussian process regression is proposed for calculating accurate confidence bands of time-varying, noisy reference data when validating dynamic models with a confidence band approach.
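The validation rule described above can be illustrated with a minimal sketch. It is not the implementation used in this work. The function `toy_yaw_rate` is a hypothetical first-order stand-in for the single-track model, the mass bounds of 1400 kg to 1600 kg are assumed, and the confidence band is a placeholder of fixed half-width around a nominal response rather than a band obtained from Gaussian process regression. Only the sampling of the mass and the per-time-step band check follow the procedure stated in the text.

```python
import math
import random


def latin_hypercube_1d(n_samples, lower, upper, rng):
    """Draw one sample from each of n_samples equal-width strata of [lower, upper]."""
    width = (upper - lower) / n_samples
    samples = [lower + (i + rng.random()) * width for i in range(n_samples)]
    rng.shuffle(samples)  # randomise the order of the design points
    return samples


def toy_yaw_rate(mass, times):
    """Hypothetical stand-in for the single-track model (NOT the model of this work).

    A first-order lag whose gain and time constant depend on the mass,
    chosen only so that the example runs.
    """
    gain = 0.3 * 1500.0 / mass   # rad/s, assumed
    tau = 0.2 * mass / 1500.0    # s, assumed
    return [gain * (1.0 - math.exp(-t / tau)) for t in times]


def is_valid(simulated_runs, lower_band, upper_band):
    """Valid only if every run lies inside the band at every time step."""
    return all(
        lo <= y <= hi
        for run in simulated_runs
        for y, lo, hi in zip(run, lower_band, upper_band)
    )


rng = random.Random(0)
times = [0.01 * k for k in range(1, 201)]              # 2 s at 100 Hz, assumed
masses = latin_hypercube_1d(20, 1400.0, 1600.0, rng)   # assumed mass bounds in kg
runs = [toy_yaw_rate(m, times) for m in masses]

# Placeholder band: nominal response plus/minus a fixed half-width.
nominal = toy_yaw_rate(1500.0, times)
wide = is_valid(runs, [y - 0.05 for y in nominal], [y + 0.05 for y in nominal])
narrow = is_valid(runs, [y - 0.001 for y in nominal], [y + 0.001 for y in nominal])
print(wide, narrow)
```

With the wide placeholder band every sampled run stays inside the band, so the toy model is marked as valid; with the narrow band at least one run leaves the band, so it is rejected. In the setting of this work, `lower_band` and `upper_band` would instead come from the confidence band estimated on the reference data.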