Content based image retrieval with sparse representations and local feature descriptors: A comparative study

Cited by: 53
Authors
Celik, Ceyhun [1]
Bilge, Hasan Sakir [2]
Affiliations
[1] Gazi Univ, Dept Comp Engn, Ankara, Turkey
[2] Gazi Univ, Dept Elect Elect Engn, Ankara, Turkey
Keywords
Content based image retrieval; Local feature descriptor; Sparse representation; Dictionary learning; Coefficient learning; FACE-RECOGNITION; OPTIMIZATION; COORDINATE; ALGORITHM; FRAMEWORK; MODEL;
DOI
10.1016/j.patcog.2017.03.006
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Content Based Image Retrieval (CBIR) has been widely studied over the last two decades. Unlike text-based image retrieval techniques, CBIR uses the visual properties of images to obtain high-level semantic information. The gap between low-level features and high-level semantic information, known as the semantic gap, is the most important problem in CBIR. In early work, visual properties were extracted from low-level features such as color, shape, texture and spatial information. Local Feature Descriptors (LFDs) are more successful at improving the performance of CBIR systems. A semantic bridge to high-level semantic information is then built, and Sparse Representations (SRs) have become popular for this purpose in recent years. In this study, CBIR models from the literature that use LFDs and SRs are investigated in detail. The SR and LFD extraction algorithms are tested and compared within a CBIR framework under different scenarios. Scale Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), Histograms of Oriented Gradients (HoG), Local Binary Pattern (LBP) and Local Ternary Pattern (LTP) are used to extract LFDs from images. Random Features, K-Means and K-Singular Value Decomposition (K-SVD) are used for dictionary learning, and Orthogonal Matching Pursuit (OMP), Homotopy, Lasso, Elastic Net, Parallel Coordinate Descent (PCD) and Separable Surrogate Function (SSF) are used for coefficient learning. Finally, three methods recently proposed in the literature (Online Dictionary Learning (ODL), Locality-constrained Linear Coding (LLC) and Feature-based Sparse Representation (FBSR)) are also tested and compared with the results of our framework. All test results are presented and discussed. In conclusion, the most successful approach in our framework is to use LLC for the Coil20 data set and FBSR for the Corel1000 data set. We obtain 89% and 58% Mean Average Precision (MAP) for Coil20 and Corel1000, respectively. (C) 2017 Elsevier Ltd. All rights reserved.
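The abstract reports its results as Mean Average Precision (MAP) but does not spell out the evaluation protocol. As a rough illustration only, the sketch below computes MAP under the common definition: average precision per query over a ranked result list, then the mean over queries. The function names and the choice to normalize by the number of relevant items found in the ranked list are assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one query.

    relevance[i] is True when the result at rank i + 1 is relevant.
    Assumption: the ranked list covers every relevant item, so the
    normalizer is the number of relevant items seen in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Two hypothetical queries with made-up relevance judgments.
    queries = [[True, False, True], [False, True]]
    print(round(mean_average_precision(queries), 4))
```

For the two hypothetical queries, the per-query values are 5/6 and 1/2, so the printed mean is 0.6667.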
Pages: 1-13
Number of pages: 13