Integrating OpenStreetMap crowdsourced data and Landsat time series imagery for rapid land use/land cover (LULC) mapping: Case study of the Laguna de Bay area of the Philippines

Cited by: 117
Authors
Johnson, Brian A. [1]
Iizuka, Kotaro [2]
Affiliations
[1] Inst Global Environm Strategies, 2108-11 Kamiyamaguchi, Hayama, Kanagawa 2400115, Japan
[2] Kyoto Univ, Res Inst Sustainable Humanosphere, Uji, Kyoto 6110011, Japan
Keywords
OpenStreetMap; Volunteered geographic information; Citizen science; Crowdsourced data; Random forest; Landsat 8; Google Earth Engine; SUPPORT VECTOR MACHINES; TRAINING DATA; FOREST; INFORMATION; URBAN; CLASSIFICATIONS; ACCURACY; MAPS
DOI
10.1016/j.apgeog.2015.12.006
Chinese Library Classification number
P9 [Physical geography]; K9 [Geography]
Discipline classification code
0705; 070501
Abstract
We explored the potential for rapid land use/land cover (LULC) mapping using time-series Landsat satellite imagery and training data (for supervised classification) automatically extracted from crowdsourced OpenStreetMap (OSM) "landuse" (OSM-LU) and "natural" (OSM-N) polygon datasets. The main challenge with using these data for LULC classification was their high level of noise, as the Landsat images all contained varying degrees of cloud cover (causes of attribute noise) and the OSM polygons contained locational errors and class labeling errors (causes of class noise). A second challenge arose from the imbalanced class distribution in the extracted training data, which occurred due to wide discrepancies in the area coverage of each OSM-LU/OSM-N class. To address the first challenge, three relatively noise-tolerant algorithms, naive Bayes (NB), decision tree (C4.5 algorithm), and random forest (RF), were evaluated for image classification. To address the second challenge, artificial training samples were generated for the minority classes using the synthetic minority over-sampling technique (SMOTE). Image classification accuracies were calculated for a four-class, five-class, and six-class LULC system to assess the capability of the proposed methods for mapping relatively broad as well as more detailed LULC types. The highest overall accuracies achieved were 84.0% (four-class SMOTE-RF result), 81.0% (five-class SMOTE-RF result), and 72.0% (six-class SMOTE-NB result). RF and NB had relatively similar overall accuracies, while those of C4.5 were much lower. SMOTE led to higher classification accuracies for RF and C4.5, and in some cases for NB, despite the noise in the training set. The main advantages of the proposed methods are their cost- and time-efficiency, as training data for supervised classification are automatically extracted from the crowdsourced datasets and no pre-processing for cloud detection/cloud removal is performed.
© 2015 Elsevier Ltd. All rights reserved.
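The abstract names SMOTE as the step used to balance the minority classes. As a rough illustration of the general idea only (not the authors' implementation, whose software and parameters are not given here), the sketch below creates synthetic samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the two-band sample values are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-band pixel values for a rare class (illustrative only).
rare_class = [[0.10, 0.30], [0.12, 0.28], [0.15, 0.35], [0.11, 0.33]]
new_samples = smote(rare_class, n_new=6, k=2)
print(len(new_samples))
```

Because every synthetic sample lies on a line segment between two existing minority samples, the new points stay inside the region the class already occupies. Any labeling noise in the original samples is carried over as well.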
Pages: 140-149
Number of pages: 10