SPATIAL ENSEMBLE KERNEL LEARNING FOR SCENE CLASSIFICATION

Cited by: 0

Authors:

Zhang, Lei [1]

Zhen, Xiantong [2]

Zhang, Qiujing [1]

Affiliations:

[1] Guangdong Univ Petrochem Technol, Coll Comp & Elect Informat, Maoming, Peoples R China

[2] Beihang Univ, Sch Elect & Informat Engn, Beijing, Peoples R China

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018

Funding:

National Science Foundation (United States);

Keywords:

Spatial Ensemble Kernel; CNNs; Fourier Feature Embedding; Spatial Pyramid Kernel; Scene Classification;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Scene recognition is one of the most important tasks in computer vision. Apart from appearance, spatial layout carries a crucial cue for discriminative representation. In this paper, we propose spatial ensemble kernel (SEK) learning, which fuses multi-scale spatial information to achieve a compact yet discriminative representation of scenes. Based on the spatial pyramid, SEK combines the CNN features in each level of the pyramid in an ensemble and fuses them by kernels. By kernel approximation, we achieve a Fourier feature embedding of the CNN features at each scale, which establishes a nonlinear layer of the neural network with a cosine activation function. The parameters of the nonlinear layer can be learned jointly in a single optimization framework by supervised learning, which enables compact and discriminative feature representations. We show the effectiveness of the proposed SEK on two recent scene benchmark datasets, i.e., MIT indoor and SUN 397. The proposed SEK achieves high performance on both datasets, competitive with state-of-the-art algorithms.
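The Fourier feature embedding mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian kernel, uses fixed random frequencies (classic random Fourier features) in place of the supervised, jointly learned parameters the abstract describes, and substitutes random vectors for pooled CNN features. The names `fourier_embed` and `spatial_ensemble_embed`, the three pyramid levels, and the values of `d`, `D`, and `sigma` are all hypothetical.

```python
import numpy as np

def fourier_embed(x, W, b):
    """Cosine-activation layer: z(x) = sqrt(2/D) * cos(W x + b)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(x @ W.T + b)

def spatial_ensemble_embed(level_feats, params):
    """Embed each pyramid level separately, then concatenate.

    Concatenating the per-level embeddings makes their inner product
    approximate the SUM of the per-level kernels.
    """
    return np.concatenate(
        [fourier_embed(x, W, b) for x, (W, b) in zip(level_feats, params)],
        axis=-1,
    )

rng = np.random.default_rng(0)
d, D, sigma, n_levels = 64, 4096, 4.0, 3  # hypothetical sizes and bandwidth

# Assumption: frequencies drawn at random for a Gaussian kernel.
# In the paper these parameters are learned; here they stay fixed.
params = [
    (rng.normal(0.0, 1.0 / sigma, size=(D, d)), rng.uniform(0.0, 2.0 * np.pi, size=D))
    for _ in range(n_levels)
]

# Stand-ins for pooled CNN features of two similar scenes, one vector per level.
x_levels = [rng.normal(size=d) for _ in range(n_levels)]
y_levels = [x + 0.1 * rng.normal(size=d) for x in x_levels]

zx = spatial_ensemble_embed(x_levels, params)
zy = spatial_ensemble_embed(y_levels, params)

approx = float(zx @ zy)  # inner product in the embedded space
exact = float(sum(
    np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
    for x, y in zip(x_levels, y_levels)
))  # sum of exact Gaussian kernels over the levels
print(approx, exact)
```

With a large enough `D`, the two printed values should be close, which is the sense in which a cosine layer approximates a kernel. A linear classifier trained on `zx` would then act like a kernel machine on the multi-level features.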


Pages: 1303-1307

Page count: 5
