Spatial prediction using random forest spatial interpolation with sample augmentation: a case study for precipitation mapping

Cited by: 0
Authors
Jiao Sijia
Wu Tianjun
Luo Jiancheng
Zhou Ya’nan
Dong Wen
Wang Changpeng
Dong Shiying
Affiliations
[1] Chang’an University, School of Sciences
[2] State Key Laboratory of Remote Sensing Science, College of Hydrology and Water Resources
[3] Aerospace Information Research Institute
[4] Chinese Academy of Sciences
[5] University of Chinese Academy of Sciences
[6] College of Resources and Environment
[7] Hohai University
Source
Earth Science Informatics | 2023 / Vol. 16
Keywords
Data augmentation; Random forest; Spatial prediction; Precipitation; Mixup; Upsampling; Small sample
DOI
Not available
CLC number
Subject classification number
Abstract
Spatial prediction (SP) based on machine learning (ML) has been applied to soil water quality, air quality, the marine environment, and other domains. However, it still handles small samples poorly. ML normally requires large numbers of training samples to prevent underfitting, and the data augmentation (DA) methods mixup and synthetic minority over-sampling technique (SMOTE) ignore the similarity of geographic information. This paper therefore proposes a modified upsampling method and combines it with random forest spatial interpolation (RFSI) to address the small-sample problem in geographical space. The modification has two aspects. First, when selecting nearest points, the method chooses points with similar geographic information, drawn from the same category after classification. Second, the difference used is the difference within each category. To verify the effectiveness of the proposed method, we use daily precipitation data for Chongqing in January 2018. The experimental results show that combining the modified upsampling method with RFSI effectively improves the accuracy of SP.
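The abstract gives only the outline of the modified upsampling, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes a SMOTE-style interpolation in which each sample's nearest neighbours are searched only among samples of the same category, and the difference vector is therefore always a within-category difference. The function name `upsample_within_category`, its parameters, and the idea that category labels are supplied from a prior classification step are all hypothetical.

```python
import math
import random


def upsample_within_category(samples, labels, k=2, n_new=1, seed=0):
    """SMOTE-style synthesis restricted to same-category neighbours.

    samples: list of equal-length numeric tuples (hypothetical layout,
             e.g. (x, y, precipitation)).
    labels:  one category id per sample, assumed to come from an
             earlier classification step.
    Returns a list of (synthetic_sample, category) pairs.
    """
    rng = random.Random(seed)

    # Group sample indices by category.
    by_cat = {}
    for idx, cat in enumerate(labels):
        by_cat.setdefault(cat, []).append(idx)

    synthetic = []
    for cat, indices in by_cat.items():
        if len(indices) < 2:
            continue  # no same-category neighbour to interpolate towards
        for i in indices:
            base = samples[i]
            # Nearest neighbours are searched only inside the category.
            others = [j for j in indices if j != i]
            others.sort(key=lambda j: math.dist(base, samples[j]))
            neighbours = others[:k]
            for _ in range(n_new):
                nb = samples[rng.choice(neighbours)]
                gap = rng.random()  # interpolation weight in [0, 1)
                # New point = base + gap * (within-category difference).
                new = tuple(b + gap * (n - b) for b, n in zip(base, nb))
                synthetic.append((new, cat))
    return synthetic


# Toy data: two well-separated categories.
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0),
           (10.0, 10.0, 50.0), (11.0, 10.0, 60.0)]
labels = [0, 0, 0, 1, 1]
augmented = upsample_within_category(samples, labels, k=2, n_new=2, seed=1)
```

Because every synthetic point is a convex combination of two samples from one category, it stays inside that category's range of values. In this reading, the augmented set would then be appended to the original samples before fitting RFSI.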
Pages: 863-875
Page count: 12