Advances of Four Machine Learning Methods for Spatial Data Handling: a Review

Cited by: 139
Authors
Du, Peijun [1 ,2 ,3 ]
Bai, Xuyu [1 ,2 ,3 ]
Tan, Kun [4 ]
Xue, Zhaohui [5 ]
Samat, Alim [6 ]
Xia, Junshi [7 ]
Li, Erzhu [8 ]
Su, Hongjun [5 ]
Liu, Wei [8 ]
Affiliations
[1] Nanjing Univ, Sch Geog & Ocean Sci, Nanjing 210023, Peoples R China
[2] Key Lab Land Satellite Remote Sensing Applicat, Minist Nat Resources China, Nanjing 210023, Peoples R China
[3] Jiangsu Ctr Collaborat Innovat Geog, Informat Resource Dev & Applicat, Nanjing 210023, Peoples R China
[4] East China Normal Univ, Minist Educ, Key Lab Geog Informat Sci, Shanghai 200241, Peoples R China
[5] Hohai Univ, Sch Earth Sci & Engn, Nanjing 211100, Peoples R China
[6] Chinese Acad Sci, Xinjiang Inst Ecol & Geog, State Key Lab Desert & Oasis Ecol, Urumqi 830011, Peoples R China
[7] RIKEN Ctr Adv Intelligence Project, Tokyo 1030027, Japan
[8] Jiangsu Normal Univ, Sch Geog, Geomat & Planning, Xuzhou 221116, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine learning; Remote sensing image classification; Spatial interpolation; Support vector machine; Ensemble learning; Deep learning; Semi-supervised learning; Active learning; HYPERSPECTRAL IMAGE CLASSIFICATION; SUPPORT VECTOR MACHINES; CONVOLUTIONAL NEURAL-NETWORKS; REMOTE-SENSING IMAGES; SCENE CLASSIFICATION; FEATURE-SELECTION; FEATURE-EXTRACTION; ENSEMBLE; SVM; INFORMATION;
DOI
10.1007/s41651-020-00048-5
Chinese Library Classification (CLC) code
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Most machine learning tasks can be categorized into classification or regression problems. Regression and classification models are normally used to extract useful geographic information from observed or measured spatial data, such as land cover classification, spatial interpolation, and quantitative parameter retrieval. This paper reviews the progress of four advanced machine learning methods for spatial data handling, namely, support vector machine (SVM)-based kernel learning, semi-supervised and active learning, ensemble learning, and deep learning. These four machine learning modes are representative because they improve learning performances from different views, for example, feature space transform and decision function (SVM), optimized uses of samples (semi-supervised and active learning), and enhanced learning models and capabilities (ensemble learning and deep learning). For spatial data handling via machine learning that can be improved by the four machine learning models, three key elements are learning algorithms, training samples, and input features. To apply machine learning methods to spatial data handling successfully, a four-level strategy is suggested: experimenting and evaluating the applicability, extending the algorithms by embedding spatial properties, optimizing the parameters for better performance, and enhancing the algorithm by multiple means. Firstly, the advances of SVM are reviewed to demonstrate the merits of novel machine learning methods for spatial data, running the line from direct use and comparison with traditional classifiers, and then targeted improvements to address multiple class problems, to optimize parameters of SVM, and to use spatial and spectral features. To overcome the limits of small-size training samples, semi-supervised learning and active learning methods are then utilized to deal with insufficient labeled samples, showing the potential of learning from small-size training samples. 
Furthermore, considering the poor generalization capacity and instability of machine learning algorithms, ensemble learning is introduced to integrate the advantages of multiple learners and to enhance the generalization capacity. The typical research lines, including the combination of multiple classifiers, advanced ensemble classifiers, and spatial interpolation, are presented. Finally, deep learning, one of the most popular branches of machine learning, is reviewed with specific examples for scene classification and urban structural type recognition from high-resolution remote sensing images. By this review, it can be concluded that machine learning methods are very effective for spatial data handling and have wide application potential in the big data era.
Pages: 25