A multi-view-CNN framework for deep representation learning in image classification

Cited by: 17
Authors
Pintelas, Emmanuel [1]
Livieris, Ioannis E. [1]
Kotsiantis, Sotiris [1]
Pintelas, Panagiotis [1]
Affiliations
[1] Univ Patras, Dept Math, GR-26500 Patras, Greece
Keywords
Transfer learning; Convolutional neural networks; Deep learning; Feature augmentation; Dimensionality reduction; Image classification; DIMENSIONALITY
DOI
10.1016/j.cviu.2023.103687
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Deep representation learning for image classification is an area of computer vision in which deep Convolutional Neural Networks (CNNs) have flourished. Nevertheless, developing an efficient image recognition model for real-world applications is challenging, since image datasets contain instances with large amounts of noise and redundant information. It is therefore essential to incorporate an intelligent feature extraction and filtering method in order to create robust and efficient image representations. In this work, we propose a Multi-View-CNN framework that drastically boosts the performance of pre-trained CNN models, such as ResNet and VGG, in image classification applications. In this approach, different types of views of the same initial image are used to extract different types of features with pre-trained CNN models. To reduce the very high dimensionality of the raw CNN output features and create a robust image representation, Principal Component Analysis (PCA) is applied for dimensionality reduction. The extracted feature vectors are then concatenated into a final composite feature representation of the initial image dataset. Finally, this augmented feature vector is used to train a linear model (Logistic Regression) that performs the final classification task. The main findings of this work are as follows. First, the proposed Multi-View-CNN framework drastically increased the performance of pre-trained CNN models. Second, incorporating PCA as a final layer of the main CNN topology, instead of classical dimension reduction layers such as Average and Max Pooling, significantly improved performance.
The full implementation code of this framework, along with the datasets used in our experimental simulations, is available in our public GitHub repository: https://github.com/EmmanuelPintelas/A-Multi-View-CNN-Framework-for-Deep-Representation-Learning-in-Image-Classification.
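The fusion stage described in the abstract (per-view PCA, concatenation, then Logistic Regression) can be sketched roughly as follows. This is a minimal illustration and not the authors' code, which lives in the repository above. It assumes the CNN features for each view have already been extracted into arrays, and it substitutes synthetic data for them; the function names `fit_multi_view` and `predict_multi_view`, the number of PCA components, and the feature dimensions are all hypothetical choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression


def fit_multi_view(train_views, y_train, n_components=8):
    """Fit one PCA per view, concatenate the reduced features, fit a linear model.

    train_views: list of arrays, one per view, each of shape (n_samples, n_features_v).
    """
    pcas = [PCA(n_components=n_components).fit(v) for v in train_views]
    fused = np.hstack([p.transform(v) for p, v in zip(pcas, train_views)])
    clf = LogisticRegression(max_iter=1000).fit(fused, y_train)
    return pcas, clf


def predict_multi_view(pcas, clf, views):
    """Apply the fitted per-view PCAs, concatenate, and classify."""
    fused = np.hstack([p.transform(v) for p, v in zip(pcas, views)])
    return clf.predict(fused)


# Synthetic stand-in for per-view CNN features (assumption: real features
# would come from pre-trained models such as ResNet or VGG applied to each view).
rng = np.random.default_rng(0)
n_samples, n_classes = 200, 3
y = rng.integers(0, n_classes, size=n_samples)


def fake_view(dim):
    # Class-dependent mean plus noise, so the classes are separable.
    centers = rng.normal(size=(n_classes, dim))
    return centers[y] + 0.5 * rng.normal(size=(n_samples, dim))


views = [fake_view(64), fake_view(128)]

# Simple hold-out split: first 150 samples for training, the rest for testing.
pcas, clf = fit_multi_view([v[:150] for v in views], y[:150], n_components=8)
pred = predict_multi_view(pcas, clf, [v[150:] for v in views])
accuracy = float(np.mean(pred == y[150:]))
print(f"hold-out accuracy: {accuracy:.2f}")
```

Each PCA is fitted on the training portion only and then reused at prediction time, so the held-out samples do not influence the learned projections.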
Pages: 12
Related papers
45 in total
[1]  
Angeloni M.A., 2017, IEEE INT WORKS MACH, P1
[2]  
[Anonymous], 2018, UMAP UNIFORM MANIFOL
[3]  
[Anonymous], 2015, ICLR
[4]  
Baldi P., 2012, JMLR WORKSHOP C P, P37
[5]   Xception: Deep Learning with Depthwise Separable Convolutions [J].
Chollet, Francois .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1800-1807
[6]   CNN-based features for retrieval and classification of food images [J].
Ciocca, Gianluigi ;
Napoletano, Paolo ;
Schettini, Raimondo .
COMPUTER VISION AND IMAGE UNDERSTANDING, 2018, 176 :70-77
[7]  
Durall Ricard, 2019, Unmasking DeepFakes with simple Features
[8]   Classification of hyperspectral images with convolutional neural networks and probabilistic relaxation [J].
Gao, Qishuo ;
Lim, Samsung .
COMPUTER VISION AND IMAGE UNDERSTANDING, 2019, 188
[9]   Res2Net: A New Multi-Scale Backbone Architecture [J].
Gao, Shang-Hua ;
Cheng, Ming-Ming ;
Zhao, Kai ;
Zhang, Xin-Yu ;
Yang, Ming-Hsuan ;
Torr, Philip .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (02) :652-662
[10]  
Geirhos R., 2019, INT C LEARN REPR