Rethinking 3D-CNN in Hyperspectral Image Super-Resolution

Cited by: 2
Authors
Liu, Ziqian [1 ]
Wang, Wenbing [1 ]
Ma, Qing [1 ]
Liu, Xianming [1 ]
Jiang, Junjun [1 ]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
3D convolution; hyperspectral image; super-resolution; convolutional neural network
DOI
10.3390/rs15102574
CLC Number
X [Environmental Science, Safety Science];
Discipline Classification Codes
08; 0830;
Abstract
Recently, CNN-based methods for hyperspectral image super-resolution (HSISR) have achieved outstanding performance. Because hyperspectral images span many spectral bands, 3D convolutions are natural candidates for extracting spatial-spectral correlations. However, pure 3D CNN models are rarely used, since they are generally considered too complex, data-hungry to train, and prone to overfitting on relatively small hyperspectral datasets. In this paper, we question this common notion and propose the Full 3D U-Net (F3DUN), a fully 3D CNN model combined with the U-Net architecture. The skip connections allow the model to be deeper and to exploit multi-scale features. Extensive experiments show that, with a carefully designed architecture, F3DUN achieves state-of-the-art performance on HSISR, demonstrating the effectiveness of fully 3D CNNs for this task. To further explore the properties of the fully 3D model, we also build a 3D/2D mixed model, a design popular in prior work, called the Mixed U-Net (MUN), which shares a similar architecture with F3DUN. Analysis of F3DUN and MUN shows that 3D convolutions give the model a larger capacity: with the same number of parameters, the fully 3D model obtains better results than the 3D/2D mixed model when sufficiently trained. Moreover, experiments show that the fully 3D model remains competitive with the 3D/2D mixed model on a small-scale dataset, suggesting that 3D CNNs are less sensitive to the amount of training data than commonly believed. Extensive experiments on two benchmark datasets, CAVE and Harvard, demonstrate that the proposed F3DUN exceeds state-of-the-art HSISR methods both quantitatively and qualitatively.
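To make the idea concrete, the sketch below shows what a fully 3D U-Net style encoder-decoder for hyperspectral data can look like. It is a minimal PyTorch illustration, not the authors' F3DUN: the layer counts, channel widths, and downsampling factors are assumptions chosen for brevity. The hyperspectral cube is treated as a single-channel 5D tensor of shape (batch, 1, bands, height, width), so every convolution is an nn.Conv3d that filters the spectral and spatial dimensions jointly, with one U-Net skip connection and a global residual.

```python
# Minimal sketch of a fully 3D U-Net style block for hyperspectral images.
# Illustrative only: channel widths, depth, and strides are assumptions,
# not the configuration reported in the paper.
import torch
import torch.nn as nn


class Conv3DBlock(nn.Module):
    """Two 3x3x3 convolutions acting jointly on spectral and spatial dims."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class TinyFull3DUNet(nn.Module):
    """Encoder-decoder with a skip connection; every layer is a 3D convolution."""

    def __init__(self, base_ch=16):
        super().__init__()
        self.enc = Conv3DBlock(1, base_ch)
        # Downsample spatial dims only; keep the full spectral resolution.
        self.down = nn.Conv3d(base_ch, base_ch * 2, kernel_size=3,
                              stride=(1, 2, 2), padding=1)
        self.bottleneck = Conv3DBlock(base_ch * 2, base_ch * 2)
        self.up = nn.ConvTranspose3d(base_ch * 2, base_ch,
                                     kernel_size=(1, 2, 2), stride=(1, 2, 2))
        self.dec = Conv3DBlock(base_ch * 2, base_ch)  # concatenated skip doubles channels
        self.out = nn.Conv3d(base_ch, 1, kernel_size=3, padding=1)

    def forward(self, x):
        e = self.enc(x)                         # (B, C, bands, H, W)
        b = self.bottleneck(self.down(e))       # spatial dims halved, bands kept
        d = self.up(b)
        d = self.dec(torch.cat([d, e], dim=1))  # U-Net skip connection
        return self.out(d) + x                  # global residual


if __name__ == "__main__":
    lr_cube = torch.randn(1, 1, 31, 32, 32)    # e.g., a 31-band CAVE patch
    print(TinyFull3DUNet()(lr_cube).shape)     # torch.Size([1, 1, 31, 32, 32])
```

Replacing the Conv3d layers in such a block with band-wise 2D convolutions in some stages would give a 3D/2D mixed variant in the spirit of MUN, which is the comparison the paper uses to isolate the contribution of full 3D convolutions.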
Pages: 21