Topic modeling and improvement of image representation for large-scale image retrieval

Cited by: 15
Authors
Nguyen Anh Tu [1]
Dong-Luong Dinh [1]
Rasel, Mostofa Kamal [1]
Lee, Young-Koo [1]
Affiliations
[1] Kyung Hee Univ, Dept Comp Sci & Engn, Yongin 446701, Gyeonggi Do, South Korea
Keywords
Topic modeling; Probabilistic graphical model; Image retrieval; Image representation; Image coding; Bag-of-visual words; CLASSIFICATION; GEOMETRY; OBJECTS;
DOI
10.1016/j.ins.2016.05.029
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present a new visual search system for finding similar images in a large database. Such a system faces a number of challenges regarding the robustness of the image representations and the efficiency of the retrieval framework. To tackle these challenges, we first propose an encoding technique based on soft-assignment of local features to convert an entire image into a single vector, which is a compact and discriminative representation. This encoded vector is suitable for most types of efficient indexing methods to produce an initial result. Because the encoding scheme does not incorporate geometric or object-related information, we then propose a probabilistic topic model to formalize the spatial structure among the local features. Moreover, the topic model allows us to effectively extract the object and background regions from the image. This extraction is performed with a Markov chain Monte Carlo algorithm for approximate inference. Finally, benefiting from the extracted objects in each image, we present a re-ranking scheme to automatically refine the initial search results. Our proposed retrieval framework has two major advantages: i) an aggregation strategy through soft-assignment improves the discriminative power of the representation, which has a determinative effect on the retrieval precision; and ii) the probabilistic latent topic model enables us not only to gain insight into the spatial structure of the image, but also to handle a large variation in the object appearance. The experimental results on four benchmark datasets show that our approach provides competitive accuracy and runs about ten times faster. Our studies also verify that the proposed approach works effectively on large-scale databases of millions of images. (C) 2016 Elsevier Inc. All rights reserved.
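The soft-assignment idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' encoding, whose details are not given in this record; it is a generic soft-assignment bag-of-visual-words histogram. The function name `soft_assign_encode`, the Gaussian kernel, and the bandwidth parameter `sigma` are all assumptions made for illustration.

```python
import numpy as np


def soft_assign_encode(descriptors, codebook, sigma=1.0):
    """Encode local descriptors (n, d) against a codebook (k, d) as one k-dim vector.

    Hypothetical helper for illustration; not the encoding from the paper.
    """
    # Squared Euclidean distance between every descriptor and every visual word.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)

    # Gaussian-kernel memberships (assumed kernel), shifted per row for stability.
    logits = -sq_dists / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Sum the soft memberships over all descriptors, then L2-normalize.
    hist = weights.sum(axis=0)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local_features = rng.normal(size=(50, 8))  # stand-in for e.g. SURF descriptors
    visual_words = rng.normal(size=(5, 8))     # stand-in for a learned codebook
    print(soft_assign_encode(local_features, visual_words))
```

Unlike hard assignment, each descriptor here contributes to several visual words in proportion to its similarity, which is the general property that soft-assignment schemes rely on to reduce quantization error.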
Pages: 99-120
Number of pages: 22
Related papers
56 in total
[1]   An introduction to MCMC for machine learning [J].
Andrieu, C ;
de Freitas, N ;
Doucet, A ;
Jordan, MI .
MACHINE LEARNING, 2003, 50 (1-2) :5-43
[2]  
[Anonymous], 2006, 2006 IEEE COMP SOC C
[3]  
[Anonymous], 2009, INT C COMP VIS THEOR
[4]  
[Anonymous], 2012, P 20 ACM INT C MULTI
[5]   All about VLAD [J].
Arandjelovic, Relja ;
Zisserman, Andrew .
2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, :1578-1585
[6]   SURF: Speeded up robust features [J].
Bay, Herbert ;
Tuytelaars, Tinne ;
Van Gool, Luc .
COMPUTER VISION - ECCV 2006 , PT 1, PROCEEDINGS, 2006, 3951 :404-417
[7]  
Beck A., 2009, CONVEX OPTIM SIGNAL
[8]   Latent Dirichlet allocation [J].
Blei, DM ;
Ng, AY ;
Jordan, MI .
JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) :993-1022
[9]   Spatial-Bag-of-Features [J].
Cao, Yang ;
Wang, Changhu ;
Li, Zhiwei ;
Zhang, Liqing ;
Zhang, Lei .
2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2010, :3352-3359
[10]  
Chemudugunta C., 2007, Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference, V19, P241