Scene classification of remote sensing image using ensemble convolutional neural network

Cited by: 0
Authors
Yu D. [1]
Zhang B. [1]
Zhao C. [1]
Guo H. [1]
Lu J. [1]
Affiliation
[1] Information Engineering University, Zhengzhou
Source
Yaogan Xuebao/Journal of Remote Sensing | 2020 / Vol. 24 / No. 06
Funding
National Natural Science Foundation of China
Keywords
Convolutional neural network; Ensemble learning; Remote sensing image; Scene classification; Transfer learning;
DOI
10.11834/jrs.20208273
Abstract
Scene classification and recognition of remote sensing images is an important task in image interpretation. High-resolution remote sensing images have rich spatial texture features and semantic information, and their scene categories are diverse. As a result, images in the same category can differ greatly while some images in different categories look similar, which makes them difficult to classify and recognize correctly. In this case, high-precision classification can only be achieved by selecting effective features and classifiers. Traditional scene classification algorithms adopt low-level or mid-level handcrafted features. These features represent the high-level semantic information of images poorly, which makes it difficult to achieve satisfactory results on massive sets of complex scene images. Deep learning, especially convolutional neural networks, has made great progress in computer vision. Compared with methods using handcrafted features, deep learning is currently the most effective approach to image classification, and applying convolutional neural networks to remote sensing image classification has achieved higher precision than methods using traditional handcrafted features. However, a deep convolutional neural network has a very large number of parameters, so training it requires many labeled images, and the training process is complicated and time-consuming. Generally, a deep convolutional neural network does not perform well when trained on only a few images. A method for image classification using an ensemble convolutional neural network is proposed to improve the performance of convolutional neural networks. The method is composed of three main phases, namely, preprocessing, feature extraction, and ensemble learning.
First, the preprocessing phase includes geometry normalization, image intensity normalization, and image augmentation. Second, in the feature extraction phase, several deep convolutional neural networks pre-trained on ImageNet are selected, their last classification layer is removed, and they are used to extract different deep features of the same image. Third, a stacking model is constructed in the ensemble learning phase. The stacking model consists of base and meta classifiers. The base classifier is composed of several logistic regression models, which are trained on the different features extracted by the deep convolutional neural networks. The meta classifier is a support vector machine. Finally, the probability distributions predicted by the base classifier are used to construct a new dataset on which the meta classifier is trained. Experiments were conducted on two datasets, UCMerced_LandUse and NWPU-RESISC45, to verify the effectiveness of the proposed method. Compared with state-of-the-art methods, the proposed method achieved better overall accuracy. It reached overall accuracies of 90.74% and 87.21% on the two datasets, respectively, even with only 10% of the data used for training. With transfer learning, the features extracted by the deep convolutional neural networks are highly abstract and semantic, and they support classification better than handcrafted features. Through feature fusion and model transfer, the advantages of different features and classification methods can be used together. In this way, high classification accuracy can be achieved even with very little training data. © 2020, Science Press. All rights reserved.
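The stacking phase described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation. It assumes synthetic Gaussian vectors as stand-ins for the deep features of two pre-trained networks, a simple hold-out split (the abstract does not say how the meta classifier's training set is partitioned), a hand-written softmax logistic regression as base classifier, and a linear one-vs-rest support vector machine trained by subgradient descent as meta classifier (the kernel used in the paper is not stated). All function and variable names are hypothetical.

```python
import math
import random

N_CLASSES = 3

def fake_features(labels, dim, rng, spread):
    """Assumed stand-in for CNN features: Gaussian noise around a class centroid."""
    centroids = [[rng.uniform(-2, 2) for _ in range(dim)] for _ in range(N_CLASSES)]
    return [[c + rng.gauss(0, spread) for c in centroids[t]] for t in labels]

def linear_scores(W, x):
    # Each weight row holds len(x) weights followed by one bias term.
    return [sum(wj * xj for wj, xj in zip(w, x)) + w[-1] for w in W]

def predict_proba(W, x):
    """Numerically stable softmax over the linear scores."""
    z = linear_scores(W, x)
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def train_logreg(X, y, lr=0.05, epochs=100):
    """Base classifier: multinomial logistic regression, per-sample gradient steps."""
    d = len(X[0])
    W = [[0.0] * (d + 1) for _ in range(N_CLASSES)]
    for _ in range(epochs):
        for x, t in zip(X, y):
            p = predict_proba(W, x)
            for k in range(N_CLASSES):
                g = p[k] - (1.0 if k == t else 0.0)
                for j in range(d):
                    W[k][j] -= lr * g * x[j]
                W[k][d] -= lr * g
    return W

def train_linear_svm(X, y, lr=0.05, epochs=100, lam=0.01):
    """Meta classifier: one-vs-rest linear SVM, hinge-loss subgradient descent."""
    d = len(X[0])
    W = [[0.0] * (d + 1) for _ in range(N_CLASSES)]
    for _ in range(epochs):
        for x, t in zip(X, y):
            scores = linear_scores(W, x)
            for k in range(N_CLASSES):
                sign = 1.0 if k == t else -1.0
                violated = sign * scores[k] < 1.0
                for j in range(d):
                    W[k][j] -= lr * (lam * W[k][j] - (sign * x[j] if violated else 0.0))
                if violated:
                    W[k][d] += lr * sign
    return W

rng = random.Random(0)
labels = [i % N_CLASSES for i in range(90)]
# Two "views" of the same 90 images, as if extracted by two different networks.
views = [fake_features(labels, 4, rng, 1.0), fake_features(labels, 6, rng, 1.5)]
base_idx, meta_idx, test_idx = range(0, 30), range(30, 60), range(60, 90)

# One logistic regression per feature view, trained on the base split.
base_models = [
    train_logreg([v[i] for i in base_idx], [labels[i] for i in base_idx])
    for v in views
]

def meta_features(i):
    """Concatenate the class probability distributions of all base models."""
    row = []
    for W, v in zip(base_models, views):
        row.extend(predict_proba(W, v[i]))
    return row

# The predicted probabilities form the new dataset for the meta classifier.
meta_X = [meta_features(i) for i in meta_idx]
svm = train_linear_svm(meta_X, [labels[i] for i in meta_idx])

preds = [max(range(N_CLASSES), key=lambda k: linear_scores(svm, meta_features(i))[k])
         for i in test_idx]
overall_accuracy = sum(p == labels[i] for p, i in zip(preds, test_idx)) / len(preds)
print(f"overall accuracy on synthetic data: {overall_accuracy:.2f}")
```

The meta classifier never sees the raw features. Its input for each image is the concatenation of the base classifiers' class probabilities, here 2 models × 3 classes = 6 values per image. The printed accuracy refers only to the synthetic data and has no relation to the figures reported in the abstract.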
Pages: 717-727
Page count: 10