Image classification using convolutional neural network with wavelet domain inputs

Cited by: 9
Authors
Wang, Luyuan [1]
Sun, Yankui [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Neural network models - Image classification - Image enhancement - Wavelet transforms - Classification (of information) - Convolution - Textures;
DOI
10.1049/ipr2.12466
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Commonly used convolutional neural networks (CNNs) usually downsample high-resolution input images. Although this keeps the computational cost within a reasonable range, the downsampling causes information loss, which reduces image classification accuracy. An important question is how to use high-resolution inputs to improve the quality of the input information, and thus the classification accuracy, without changing the overall structure of the pre-defined CNN model or increasing the number of model parameters. Here, a CNN model with wavelet domain inputs is proposed as a solution. Specifically, the proposed method applies the wavelet packet transform or the dual-tree complex wavelet transform in the image pre-processing stage to extract information from higher-resolution input images. Selected subband image channels are then used as the inputs of conventional CNNs from which the first several convolutional layers are removed, so that the networks learn directly in the wavelet domain. Experimental results with ResNet-50 on the Caltech-256 dataset and the Describable Textures Dataset show that the method improves classification accuracy by up to 2.15% and 10.26%, respectively, which validates the effectiveness of the proposed scheme. The code is publicly available.
Pages: 2037-2048
Number of pages: 12
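
The pre-processing idea in the abstract (transform a higher-resolution image into wavelet subbands, then feed selected subbands to a CNN as extra channels) can be illustrated with a minimal sketch. This is not the authors' code. The Haar filter, the two decomposition levels, the function names, and the `keep` index list are all assumptions made for illustration; the abstract does not state which filters or which subband selection rule the paper uses, and the dual-tree complex wavelet variant is not shown.

```python
import numpy as np

def haar_dwt2(x):
    """One level of the 2-D orthonormal Haar transform of an (H, W) array.

    H and W must be even. Returns four (H/2, W/2) subbands.
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions
    lh = (a + b - c - d) / 2.0  # low-pass horizontally, high-pass vertically
    hl = (a - b + c - d) / 2.0  # high-pass horizontally, low-pass vertically
    hh = (a - b - c + d) / 2.0  # high-pass in both directions
    return ll, lh, hl, hh

def haar_wavelet_packet(x, levels):
    """Full wavelet packet decomposition: every subband is split again
    at each level, giving 4**levels subbands stacked on axis 0."""
    bands = [x]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_dwt2(band)]
    return np.stack(bands, axis=0)

def wavelet_domain_input(image, levels=2, keep=None):
    """Turn a (C, H, W) image into a stack of subband channels.

    `keep` is a hypothetical list of subband indices to retain per colour
    channel; None keeps all 4**levels subbands.
    """
    packets = [haar_wavelet_packet(channel, levels) for channel in image]
    if keep is not None:
        packets = [p[keep] for p in packets]
    return np.concatenate(packets, axis=0)

# A random stand-in for a 448 x 448 RGB image.
rng = np.random.default_rng(0)
image = rng.random((3, 448, 448))
x = wavelet_domain_input(image, levels=2, keep=[0, 1, 2, 3])
print(x.shape)  # (12, 112, 112)
```

Each decomposition level halves the spatial size and multiplies the channel count by four, so the subband stack has the spatial size that a network would otherwise reach only after several downsampling layers. That is why a network taking such inputs can drop its first convolutional layers, as the abstract describes; the adapted network itself is not shown here.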