Hierarchical Coding of Convolutional Features for Scene Recognition

Cited by: 27

Authors:

Xie, Lin [1]

Lee, Feifei [1]

Liu, Li [2]

Yin, Zhong [1]

Chen, Qiu [3]

Affiliations:

[1] Univ Shanghai Sci & Technol, Sch Opt Elect & Comp Engn, Shanghai 200093, Peoples R China

[2] Nanchang Univ, Sch Informat Engn, Nanchang 330500, Jiangxi, Peoples R China

[3] Kogakuin Univ, Elect Engn & Elect, Grad Sch Engn, Tokyo 1638677, Japan

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2020 / Vol. 22 / No. 05

Keywords:

Visualization; Convolutional codes; Encoding; Image representation; Feature extraction; Image recognition; Image coding; Convolutional feature; Inter-class linear coding; Non-negative sparse decomposition; Scene recognition; IMAGE CLASSIFICATION; REPRESENTATION; NETWORK; SCALE;

DOI:

10.1109/TMM.2019.2942478

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Convolutional neural networks (CNNs) have achieved great success in visual recognition because of the availability of large-scale image datasets such as ImageNet. However, transferring convolutional features to challenging scene recognition tasks remains an open problem. On the one hand, multiple non-linear transforms endow the convolutional features with abundant information. On the other hand, CNNs are adept at capturing the holistic appearance of scenes, but the lack of some critical local details may reduce recognition accuracy. To address these problems, we propose a novel hierarchical coding algorithm to learn effective representations. To adapt to scale variations, many patches at various scales are sampled from the whole image to provide sufficient detail. A non-negative sparse decomposition (NNSD) model based on convolutional features is proposed to learn the sharable components for each scale and to produce global signatures. Based on these global signatures, inter-class linear coding (ICLC) is proposed to learn the discriminative components and the final image representations. Experimental results indicate that our approach significantly improves recognition accuracy compared with general CNN models and achieves excellent performance on five standard benchmarks.
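The multi-scale patch sampling mentioned in the abstract can be illustrated with a minimal sketch. The record does not give the paper's actual scales, strides, or patch shapes, so the function name `multiscale_patch_boxes`, the scale fractions, and the stride ratio below are all hypothetical choices made only for illustration. The sketch covers patch sampling alone and does not attempt NNSD or ICLC.

```python
def multiscale_patch_boxes(width, height, scales=(1.0, 0.5, 0.25), stride_ratio=0.5):
    """Return a list of (scale, (left, top, right, bottom)) square patch boxes.

    Each scale is a fraction of the image's shorter side. The default scales
    and stride ratio are illustrative assumptions, not the paper's settings.
    """
    short_side = min(width, height)
    boxes = []
    for scale in scales:
        # Patch side length and sliding stride for this scale (at least 1 pixel).
        size = max(1, int(round(short_side * scale)))
        stride = max(1, int(round(size * stride_ratio)))

        # Dense grid of top-left corners that keep the patch inside the image.
        xs = list(range(0, width - size + 1, stride))
        ys = list(range(0, height - size + 1, stride))

        # Add a final window flush with the border if the grid stops short of it.
        if xs[-1] != width - size:
            xs.append(width - size)
        if ys[-1] != height - size:
            ys.append(height - size)

        for top in ys:
            for left in xs:
                boxes.append((scale, (left, top, left + size, top + size)))
    return boxes


if __name__ == "__main__":
    patches = multiscale_patch_boxes(256, 256)
    for s in (1.0, 0.5, 0.25):
        count = sum(1 for scale, _ in patches if scale == s)
        print(f"scale {s}: {count} patches")
```

In a pipeline of the kind the abstract describes, each box would be cropped, passed through a pretrained CNN to obtain a convolutional feature, and the per-scale features would then be encoded; those later steps depend on details this record does not provide.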


Pages: 1182-1192

Page count: 11
