Effect of fusing features from multiple DCNN architectures in image classification

Cited by: 41

Authors:

Akilan, Thangarajah [1]

Wu, Qingming Jonathan [1]

Zhang, Hui [2]

Affiliations:

[1] Univ Windsor, Dept Elect & Comp Engn, 401 Sunset Ave, Windsor, ON, Canada

[2] Changsha Univ Sci & Technol, Coll Elect & Informat Engn, Changsha, Hunan, Peoples R China

Source:

IET IMAGE PROCESSING | 2018 / Vol. 12 / Issue 07

Keywords:

image classification; feature extraction; image representation; neural nets; principal component analysis; image reconstruction; computer vision; DCNN architectures; automatic image classification; pre-trained deep convolutional neural networks; generalised feature space; principal component reconstruction; energy-level normalisation; fused feature vectors; image statistics representation; multiclass linear support vector machine; FFV;

DOI:

10.1049/iet-ipr.2017.0232

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic image classification has become a necessary task for handling the rapidly growing use of digital images. It has branched into many algorithms and adopted new techniques. Among them, feature fusion-based image classification methods have traditionally relied on hand-crafted features. However, it has been shown that the bottleneck features extracted through pre-trained convolutional neural networks (CNNs) can improve classification accuracy. Hence, this study analyses the effect of fusing such cues from multiple architectures without being tied to any hand-crafted features. First, the CNN features are extracted from three different pre-trained models, namely AlexNet, VGG-16, and Inception-V3. Then, a generalised feature space is formed by employing principal component reconstruction and energy-level normalisation, where the features from the individual CNNs are mapped into a common subspace and embedded using arithmetic rules to construct fused feature vectors (FFVs). This transformation plays a vital role in creating an appearance-invariant representation by capturing the complementary information of different high-level features. Finally, a multi-class linear support vector machine is trained. The experimental results demonstrate that such multi-modal CNN feature fusion is well suited to image/object classification tasks, but, surprisingly, it has not so far been explored extensively by the computer vision research community.
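The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so everything here is an assumption made for illustration. That includes the helper names (`pca_project`, `energy_normalise`, `fuse`), the use of a plain PCA projection in place of the paper's principal component reconstruction, the reading of energy-level normalisation as unit L2 norm per sample, the set of arithmetic rules, and the feature sizes (4096, 4096, 2048). Random matrices stand in for features that would in practice come from the pre-trained AlexNet, VGG-16, and Inception-V3 models.

```python
import numpy as np


def pca_project(features, n_components):
    """Project the rows of `features` onto their top principal components."""
    centred = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


def energy_normalise(features, eps=1e-12):
    """Scale each row to unit L2 norm (assumed meaning of energy-level normalisation)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)


def fuse(feature_sets, n_components, rule="sum"):
    """Map each feature set to a common subspace size, then combine element-wise."""
    mapped = [energy_normalise(pca_project(f, n_components)) for f in feature_sets]
    stacked = np.stack(mapped, axis=0)  # shape: (n_models, n_samples, n_components)
    if rule == "sum":
        return stacked.sum(axis=0)
    if rule == "mean":
        return stacked.mean(axis=0)
    if rule == "product":
        return stacked.prod(axis=0)
    if rule == "max":
        return stacked.max(axis=0)
    raise ValueError(f"unknown fusion rule: {rule}")


# Stand-in features: 20 images, one matrix per (hypothetical) CNN feature extractor.
rng = np.random.default_rng(0)
alexnet_like = rng.normal(size=(20, 4096))
vgg16_like = rng.normal(size=(20, 4096))
inception_like = rng.normal(size=(20, 2048))

# Fused feature vectors (FFVs): one 8-dimensional vector per image.
ffv = fuse([alexnet_like, vgg16_like, inception_like], n_components=8, rule="sum")
```

In a full pipeline the rows of `ffv` would be passed, together with class labels, to a multi-class linear support vector machine. The PCA basis would also need to be fitted on training data only and reused for test data, which this sketch omits for brevity.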


Pages: 1102-1110

Page count: 9

