The Qinghai-Tibet Plateau has complex geomorphic features, which makes it difficult to interpolate surface air temperature precisely because of sparse samples and intensive spatial variations. Different interpolation methods have been developed, and they perform differently under various situations. The statistical errors of interpolation methods are determined by the population properties, the condition of the samples, and the adequacy of covariates. However, few studies have focused on optimal interpolation strategies for Qinghai-Tibet Plateau. In this study, seven typical interpolation models were used and compared. The model-based methods (e.g., ordinary kriging), design-based methods (inverse distance weight (IDW), thin plate splines (TPS), and combined methods (e.g., spatiotemporal regression kriging) were considered. Using auxiliary information, spatiotemporal ordinary kriging, and spatiotemporal stratified kriging models were built. Methods were evaluated by cross validation with mean absolute error (MAE) and root-mean-square error (RMSE). Results showed that for both of the index (RMSE, MAE), spatiotemporal kriging stratified by seasons (1.016 degrees C RMSE, 0.767 degrees C MAE) < spatiotemporal kriging stratified by climate regions (1.018 <degrees>C, 0.767 degrees C) < spatiotemporal ordinary kriging (1.022 <degrees>C, 0.774 degrees C) < spatiotemporal regression kriging (1.058 <degrees>C, 0.806 degrees C) < TPS (1.551 <degrees>C, 1.143 degrees C) < ordinary kriging (2.674 <degrees>C, 2.044 degrees C) < IDW (2.917 <degrees>C, 2.296 degrees C). In conclusion, under the condition of sparsely distributed stations and complex geomorphic features in the study area, taking advantages of time dimensional information, spatiotemporal heterogeneity and covariates (i.e., elevation) can improve interpolation precision effectively.