Due to signal reflection and diffraction, site-specific unmodeled errors such as multipath effects and Non-Line-of-Sight reception are significant error sources in the Global Navigation Satellite System (GNSS), since they cannot be easily mitigated. However, how to characterize and model the internal mechanisms and external influences of these site-specific unmodeled errors is still to be investigated. Therefore, we propose a data-driven method for characterizing and modeling site-specific unmodeled errors under reflection and diffraction. Specifically, we first consider the commonly used potential features that may give rise to the site-specific unmodeled errors. We then use random forest regression to comprehensively analyze the correlations between the site-specific unmodeled errors and the potential features. We finally characterize and model the site-specific unmodeled errors. Two 7-consecutive datasets dominated by signal reflection and diffraction were collected. The results show that the correlations with the potential features differ significantly, and that these differences are highly related to the application scenarios, observation types, and satellite types. Notably, the innovation vector often shows a strong correlation with the code site-specific unmodeled errors. The phase site-specific unmodeled errors are highly correlated with elevation, azimuth, number of visible satellites, and between-frequency differenced phase observations. In the reflection and diffraction environments, the sum of the correlations of the top six potential features reaches approximately 88.5% and 87.7%, respectively. Meanwhile, these cumulative correlations remain stable across different observation types and satellite types.
By integrating a transformer model with the random forest method, a high-precision unmodeled error prediction model is established, which demonstrates the necessity of including multiple features for accurate and efficient characterization and modeling of site-specific unmodeled errors.
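The feature-ranking step described above can be illustrated with a minimal sketch. It assumes that the "correlation" of a feature is measured by something like the impurity-based importance of a random forest regressor, which may differ from the measure actually used in this work. The data are entirely synthetic, and the feature names, the helper functions `make_synthetic` and `rank_features`, and the toy error model are hypothetical choices made only for illustration, using scikit-learn's `RandomForestRegressor`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical feature names, loosely following the abstract.
FEATURES = [
    "elevation", "azimuth", "num_visible_sats",
    "innovation", "bf_diff_phase", "irrelevant",
]


def make_synthetic(n=2000, seed=0):
    """Build a synthetic feature matrix and a toy 'code unmodeled error'."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.uniform(5, 90, n),    # elevation (deg)
        rng.uniform(0, 360, n),   # azimuth (deg)
        rng.integers(4, 15, n),   # number of visible satellites
        rng.normal(0, 1, n),      # innovation
        rng.normal(0, 1, n),      # between-frequency differenced phase
        rng.normal(0, 1, n),      # feature with no influence on the target
    ])
    # Toy assumption: the error is driven mainly by the innovation,
    # with a small elevation-dependent term and measurement noise.
    y = (2.0 * X[:, 3]
         + 0.5 * np.cos(np.radians(X[:, 0]))
         + rng.normal(0, 0.1, n))
    return X, y


def rank_features(X, y, names, seed=0):
    """Fit a random forest and return (name, importance) pairs, largest first."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X, y)
    importance = model.feature_importances_  # normalized to sum to 1
    order = np.argsort(importance)[::-1]
    return [(names[i], float(importance[i])) for i in order]


X, y = make_synthetic()
ranking = rank_features(X, y, FEATURES)
for name, score in ranking:
    print(f"{name:>18s}: {score:.3f}")

# Cumulative share of the top-k features, analogous to the
# "sum of the correlations of the top six potential features".
top_k = 3
print(f"top-{top_k} cumulative importance: "
      f"{sum(score for _, score in ranking[:top_k]):.3f}")
```

In this toy setup the innovation should dominate the ranking because it was constructed to do so. With real observations, the ranking would depend on the scenario, observation type, and satellite type, as reported above.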