A multi-view-CNN framework for deep representation learning in image classification

Cited by: 17

Authors:

Pintelas, Emmanuel [1]

Livieris, Ioannis E. [1]

Kotsiantis, Sotiris [1]

Pintelas, Panagiotis [1]

Affiliations:

[1] Univ Patras, Dept Math, GR-26500 Patras, Greece

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2023 / Volume 232

Keywords:

Transfer learning; Convolutional neural networks; Deep learning; Feature augmentation; Dimensionality reduction; Image classification; DIMENSIONALITY;

DOI:

10.1016/j.cviu.2023.103687

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep representation learning in image classification is an area of computer vision in which deep Convolutional Neural Networks (CNNs) have flourished. Nevertheless, developing an efficient image recognition model for real-world applications is a challenging task, since image datasets are characterized by instances with a large amount of noise and redundant information. It is therefore essential to incorporate an intelligent feature extraction and filtering method in order to create robust and efficient image representations. In this work, we propose a Multi-View-CNN framework which drastically boosts the performance of pre-trained CNN models, such as ResNet and VGG, in image classification applications. In this approach, different types of views of the same initial image are used to extract different types of features with pre-trained CNN models. To reduce the very high dimensionality of the raw CNN output features and create a robust image representation, the Principal Component Analysis (PCA) dimension reduction method is applied. Then, all of the extracted feature vectors are concatenated, building a final composite feature representation of the initial image dataset. Finally, this augmented feature vector is used to train a linear model (Logistic Regression) that performs the final classification task. The main findings of this work are summarized as follows. First, the proposed Multi-View-CNN framework drastically increased the performance of pre-trained CNN models. Second, incorporating PCA as a final layer in the main CNN topology, instead of the classical dimension reduction layers such as Average and Max Pooling, significantly improved performance.
The full implementation code of this framework, along with the datasets used in our experimental simulations, is available in our public GitHub repository: https://github.com/EmmanuelPintelas/A-Multi-View-CNN-Framework-for-Deep-Representation-Learning-in-Image-Classification.
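
The pipeline described in the abstract (several views per image, per-view feature extraction, PCA on each view's features, concatenation, then a linear classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which lives in the repository linked above. Everything specific here is an assumption made for illustration: the three views (raw, horizontal gradient, vertical gradient), the fixed random projection with ReLU that stands in for a frozen pre-trained CNN such as ResNet or VGG, the synthetic 8x8 two-class images, the choice of 8 principal components per view, and the hand-written gradient-descent logistic regression.

```python
import numpy as np

def make_views(images):
    """Hypothetical views: raw image plus horizontal and vertical gradients."""
    grad_x = np.diff(images, axis=2, prepend=images[:, :, :1])
    grad_y = np.diff(images, axis=1, prepend=images[:, :1, :])
    return [images, grad_x, grad_y]

def extract_features(view, weights):
    """Placeholder for a frozen pre-trained CNN: fixed random projection + ReLU."""
    flat = view.reshape(len(view), -1)
    return np.maximum(flat @ weights, 0.0)

def fit_pca(features, n_components):
    """Fit PCA with an SVD of the centered feature matrix."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def composite_representation(images, extractors, pcas=None, n_components=8):
    """Extract per-view features, reduce each with PCA, then concatenate."""
    feats = [extract_features(v, w) for v, w in zip(make_views(images), extractors)]
    if pcas is None:  # fit PCA only when no fitted projections are supplied
        pcas = [fit_pca(f, n_components) for f in feats]
    reduced = [(f - mean) @ comps.T for f, (mean, comps) in zip(feats, pcas)]
    return np.concatenate(reduced, axis=1), pcas

def fit_logistic_regression(x, y, lr=0.1, steps=500):
    """Binary logistic regression trained with plain gradient descent."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        logits = np.clip(x @ w + b, -30.0, 30.0)  # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-logits))
        w -= lr * (x.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic two-class data (assumption): class 0 is brighter on the left half,
# class 1 is brighter on the right half, with Gaussian noise everywhere.
rng = np.random.default_rng(0)
n, h, w_px = 200, 8, 8
labels = rng.integers(0, 2, size=n)
images = rng.normal(0.0, 0.5, size=(n, h, w_px))
images[labels == 0, :, : w_px // 2] += 1.0
images[labels == 1, :, w_px // 2 :] += 1.0

# One stand-in "extractor" per view, each mapping 64 pixels to 32 features.
extractors = [rng.normal(0.0, 1.0 / np.sqrt(h * w_px), size=(h * w_px, 32))
              for _ in range(3)]

# Fit PCA on the training split and reuse the fitted projections on the test split.
train, test = slice(0, 150), slice(150, n)
train_repr, pcas = composite_representation(images[train], extractors)
test_repr, _ = composite_representation(images[test], extractors, pcas)

# Scale each composite feature, train the linear model, and score held-out data.
scale = train_repr.std(axis=0) + 1e-8
weights, bias = fit_logistic_regression(train_repr / scale, labels[train].astype(float))
predictions = (test_repr / scale) @ weights + bias > 0
test_acc = float(np.mean(predictions == (labels[test] == 1)))
print(f"composite shape: {train_repr.shape}, held-out accuracy: {test_acc:.2f}")
```

In a realistic setting, `extract_features` would be replaced by the penultimate activations of an actual pre-trained network, and the PCA and classifier would typically come from an established library. The sketch only shows how the per-view reductions are fitted on training data and then concatenated into one composite vector per image.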

Pages: 12
