String representations and distances in deep Convolutional Neural Networks for image classification

Cited by: 51
Authors
Barat, Cecile [1]
Ducottet, Christophe [1]
Affiliations
[1] Univ St Etienne, Univ Lyon, Lab Hubert Curien, CNRS,UMR 5516, F-42000 St Etienne, France
Keywords
Convolutional Neural Network; String representation; Edit distance; Image classification
DOI
10.1016/j.patcog.2016.01.007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent advances in image classification mostly rely on the use of powerful local features combined with an adapted image representation. Although Convolutional Neural Network (CNN) features learned from ImageNet were shown to be generic and very efficient, they still lack the flexibility to take into account variations in the spatial layout of visual elements. In this paper, we investigate the use of structural representations on top of pretrained CNN features to improve image classification. Images are represented as strings of CNN features. Similarities between such representations are computed using two new edit distance variants adapted to the image classification domain. Our algorithms have been implemented and tested on several challenging datasets: 15Scenes, Caltech101, Pascal VOC 2007 and MIT indoor. The results show that our idea of using structural string representations and distances clearly improves the classification performance over standard approaches based on CNN and SVM with linear kernel, as well as over other recognized methods in the literature. (C) 2016 Elsevier Ltd. All rights reserved.
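To make the idea of comparing images as strings of features concrete, here is a minimal sketch of a generic edit distance over sequences of feature vectors. It is not one of the two variants proposed in the paper, whose details the abstract does not give. The function names, the constant insertion/deletion cost, and the use of Euclidean distance as the substitution cost are all assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def edit_distance(s, t, indel_cost=1.0, sub_cost=euclidean):
    """Classic dynamic-programming edit distance between two "strings"
    whose symbols are feature vectors.

    Assumptions (not from the paper): inserting or deleting a symbol has
    a fixed cost `indel_cost`; substituting one symbol for another costs
    `sub_cost(a, b)`, here the Euclidean distance between the vectors.
    """
    n, m = len(s), len(t)
    # d[i][j] = cost of transforming the first i symbols of s
    # into the first j symbols of t.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost,                        # delete s[i-1]
                d[i][j - 1] + indel_cost,                        # insert t[j-1]
                d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),  # substitute
            )
    return d[n][m]


# Toy "images": each is a short string of 2-D feature vectors.
image_a = [(0.0, 0.0), (1.0, 0.0)]
image_b = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
distance_ab = edit_distance(image_a, image_b)
```

In this toy case the two strings share their first two symbols, so the cheapest alignment is a single insertion. In a real pipeline each symbol would be a high-dimensional CNN descriptor, and the resulting distance would typically be turned into a kernel for an SVM; both of those steps are outside what this sketch shows.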
Pages: 104-115
Number of pages: 12