Hyperspectral image classification of the deep neural network based on 3D convolution and dense connection

Authors
Song T. [1 ]
Zong D. [1 ]
Liu T. [1 ]
Fan H. [2 ]
Huang T. [2 ]
Jiang X. [2 ]
Wang H. [1 ]
Affiliations
[1] College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao
[2] Zhuhai Orbita Aerospace Science & Technology Co., Ltd., Artificial Intelligence Research Institute, Zhuhai
Keywords
deep learning; dense connection; hyperspectral image classification; lightweight network; separable convolution;
DOI
10.11834/jrs.20210313
Abstract
With the progress of deep learning, researchers are paying increasing attention to its application in hyperspectral image classification. Many experiments have been conducted to achieve a trade-off between accuracy and efficiency and to improve the feature extraction performance of neural networks on small training sample sets. This work proposes a high-speed, high-precision neural network structure based on spatial-spectral information. A cascaded neural network for spectral-spatial information extraction is constructed by combining the idea of DenseNet and adopting separable convolutions instead of traditional 3D convolutions as the main computation. The whole network structure is divided into four components: spectral information extraction, spectral compression, fusion of spatial and spectral information, and a voting solution. Three convolutional layers are built in the spectral information extraction component. In each layer, 1×1×7 convolution kernels are used to extract spectral information while keeping spatial information independent; the number of kernels is set to 60. Following the DenseNet idea, the outputs of the first and second layers are concatenated along the spectral dimension and input into the third layer, and the outputs of the first, second, and third layers are concatenated in the same way and input into the spectral compression component. In the spectral compression component, a 1×1×7 convolution kernel with a stride of three is used; the spectral dimension is compressed, and the parameter count of the deeper layers is reduced by shrinking the feature maps. The spatial and spectral information fusion component fuses spatial information for the first time with 3×3 receptive fields and integrates the spectral information of the data. Separable convolutions are adopted instead of traditional 3D convolutions: the 3×3×K convolution kernel is decomposed into a 3×3×1 convolution and a 1×1×K convolution, where K equals the spectral dimension of the input feature map. Forty 9×9×1 feature maps are then output. Voting means that if most pixels output the same value, the average of all values is also pulled toward that value. In the voting component, parameter-free global average pooling reduces each 9×9×1 feature map to a 1×1×1 output value. These 40 output values are concatenated and fed into the fully connected layer, and the classification results are output through Softmax. A series of experiments was carried out on the Indian Pines (IP), Pavia University (UP), and Kennedy Space Center (KSC) datasets. On the IP dataset, the average accuracy (AA) reaches 95.0%, the overall accuracy (OA) 97.4%, and the Kappa coefficient 0.97 when training with 5% of the data. On the UP dataset, OA, AA, and Kappa reach 97.6%, 97.1%, and 0.97, respectively, when training with 0.5% of the data. The overall accuracy on the KSC dataset reaches 99.2%. The network is thus shown to have strong feature extraction and classification ability. This method effectively improves the classification accuracy of hyperspectral images with small sample sets, and the effect of training and input data sizes on classification accuracy is also studied. The classification accuracy of the network improves as the training or input data increases; however, the redundant information introduced by very large amounts of training data or excessive input data does not further improve classification performance. © 2022 National Remote Sensing Bulletin. All rights reserved.
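The architecture described in the abstract can be summarized in a short sketch. The following is a minimal PyTorch implementation under stated assumptions: the class name, the padding choices, the DenseNet-style concatenation along the channel axis (rather than an exact spectral-dimension splice), and the placement of batch normalization are illustrative and not taken from the authors' code. Only the quoted hyperparameters (60 spectral kernels of size 1×1×7, a 1×1×7 compression convolution with stride 3, a 3×3×1 plus 1×1×K separable fusion producing 40 maps, and global average pooling before the fully connected layer) come from the abstract.

# Minimal sketch of the described spectral-spatial network (assumptions noted above).
import torch
import torch.nn as nn


class SpectralSpatialNet(nn.Module):
    def __init__(self, in_bands: int, num_classes: int):
        super().__init__()

        # Spectral information extraction: three layers of 60 "1x1x7" kernels.
        # Conv3d expects (N, C, D, H, W); the spectral axis is placed on D, so a
        # 1x1x7 kernel becomes kernel_size=(7, 1, 1). Padding keeps D fixed so the
        # dense concatenations below line up (an implementation choice, not the paper's).
        def spec_block(in_ch):
            return nn.Sequential(
                nn.Conv3d(in_ch, 60, kernel_size=(7, 1, 1), padding=(3, 0, 0)),
                nn.BatchNorm3d(60),
                nn.ReLU(inplace=True),
            )

        self.spec1 = spec_block(1)
        self.spec2 = spec_block(60)
        self.spec3 = spec_block(120)  # receives the concatenation of layers 1 and 2

        # Spectral compression: a 1x1x7 kernel with stride 3 along the spectral axis,
        # shrinking the spectral dimension and hence the deeper layers' parameter count.
        self.compress = nn.Sequential(
            nn.Conv3d(180, 60, kernel_size=(7, 1, 1), stride=(3, 1, 1)),
            nn.BatchNorm3d(60),
            nn.ReLU(inplace=True),
        )
        compressed_d = (in_bands - 7) // 3 + 1  # spectral length after the stride-3 conv

        # Spatial-spectral fusion via separable convolution: the 3x3xK kernel is
        # factored into a 3x3x1 spatial conv followed by a 1x1xK spectral conv,
        # where K is the compressed spectral length, yielding 40 maps of size 9x9x1.
        self.fuse_spatial = nn.Conv3d(60, 60, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.fuse_spectral = nn.Conv3d(60, 40, kernel_size=(compressed_d, 1, 1))
        self.fuse_act = nn.ReLU(inplace=True)

        # Voting: parameter-free global average pooling collapses each 9x9x1 map to a
        # single value; the 40 values feed the fully connected classifier.
        self.gap = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Linear(40, num_classes)

    def forward(self, x):
        # x: (N, 1, bands, patch, patch), e.g. a 9x9 neighbourhood around the labelled pixel
        f1 = self.spec1(x)
        f2 = self.spec2(f1)
        f3 = self.spec3(torch.cat([f1, f2], dim=1))            # dense connection
        compressed = self.compress(torch.cat([f1, f2, f3], dim=1))
        fused = self.fuse_act(self.fuse_spectral(self.fuse_spatial(compressed)))
        voted = self.gap(fused).flatten(1)                      # (N, 40)
        return self.fc(voted)                                   # Softmax applied in the loss


if __name__ == "__main__":
    # Smoke test with Indian Pines-like dimensions: 200 bands, 16 classes, 9x9 patches.
    net = SpectralSpatialNet(in_bands=200, num_classes=16)
    logits = net(torch.randn(2, 1, 200, 9, 9))
    print(logits.shape)  # torch.Size([2, 16])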
Pages: 2317-2328 (11 pages)
References (37 in total)
  • [1] Archibald R, Fann G., Feature selection and classification of hyperspectral images with support vector machines, IEEE Geoscience and Remote Sensing Letters, 4, 4, pp. 674-677, (2007)
  • [2] Ball J E, Wei P., Deep learning hyperspectral image classification using multiple class-based denoising autoencoders, mixed pixel training augmentation, and morphological operations, IGARSS 2018—2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 6903-6906, (2018)
  • [3] Chang C I, Zhao X L, Althouse M L G, Pan J J., Least squares subspace projection approach to mixed pixel classification for hyperspectral images, IEEE Transactions on Geoscience and Remote Sensing, 36, 3, pp. 898-912, (1998)
  • [4] Cui B G, Ma X D, Xie X Y., Hyperspectral image de-noising and classification with small training samples, Journal of Remote Sensing, 21, 5, pp. 728-738, (2017)
  • [5] Fu W, Li S T, Fang L Y., Spectral-spatial hyperspectral image classification via superpixel merging and sparse representation, 2015 IEEE International Geoscience and Remote Sensing Symposium, pp. 4971-4974, (2015)
  • [6] Gao Q S, Lim S, Jia X P., Hyperspectral image classification using convolutional neural networks and multiple feature learning, Remote Sensing, 10, 2, (2018)
  • [7] He M Y, Li B, Chen H H., Multi-scale 3D deep convolutional neural network for hyperspectral image classification, 2017 IEEE International Conference on Image Processing (ICIP), pp. 3904-3908, (2017)
  • [8] Huang G, Liu Z, Van Der Maaten L, Weinberger K Q., Densely connected convolutional networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017)
  • [9] Howard A G, Zhu M L, Chen B, Kalenichenko D, Wang W J, Weyand T, Andreetto M, Adam H., MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, (2017)
  • [10] Ioffe S, Szegedy C., Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, (2015)