Effective Image Retrieval via Multilinear Multi-Index Fusion

Cited by: 16
Authors
Zhang, Zhizhong [1, 2]
Xie, Yuan [3]
Zhang, Wensheng [1, 2]
Tian, Qi [4]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Res Ctr Precis Sensing & Control, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Shenzhen 518000, Peoples R China
[3] East China Normal Univ, Sch Comp Sci & Software Engn, Shanghai 200241, Peoples R China
[4] Huawei Noahs Ark Lab, Comp Vis, Shenzhen 518000, Peoples R China
Funding
National Natural Science Foundation of China; National Key R&D Program of China;
Keywords
Visualization; Image representation; Optimization; Buildings; Indexing; Image retrieval; multi-index fusion; tensor multi-rank; person re-identification; SCALE; FEATURES;
DOI
10.1109/TMM.2019.2915036
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Multi-index fusion has demonstrated impressive performance in retrieval tasks by integrating different visual representations in a unified framework. However, previous works mainly propagate similarities via a neighbor structure and ignore the high-order information among different visual representations. In this paper, we propose a new multi-index fusion scheme for image retrieval. By formulating this procedure as a multilinear-based optimization problem, the complementary information hidden in different indexes can be explored more thoroughly. Specifically, we first build multiple indexes from various visual representations. Then, a so-called index-specific functional matrix, which aims to propagate similarities, is introduced to update each original index. The functional matrices are then optimized in a unified tensor space to refine them, such that relevant images are pushed closer together. The optimization problem can be solved efficiently by the augmented Lagrangian method with a theoretical convergence guarantee. Unlike traditional multi-index fusion schemes, our approach embeds the multi-index subspace structure into the new indexes with a sparse constraint and thus incurs little additional memory consumption in the online query stage. Experimental evaluation on three benchmark datasets shows that the proposed approach achieves state-of-the-art performance: an N-score of 3.94 on UKBench, mAP of 94.1 on Holiday, and 62.39 on Market-1501.
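The abstract does not spell out the update rules, so the following is only a minimal sketch of one building block that augmented Lagrangian solvers for tensor multi-rank objectives commonly use: singular value thresholding applied slice by slice in the Fourier domain (the t-SVD approach). It assumes the per-index functional matrices have been stacked into a 3-way array; the function name `tensor_svt`, the stacking order, and the threshold `tau` are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def tensor_svt(T, tau):
    """Soft-threshold the singular values of every frontal slice of T
    in the Fourier domain along the third axis (t-SVD style shrinkage).

    T   : real array of shape (n1, n2, n3), e.g. n3 stacked matrices
          (an assumption made for this sketch).
    tau : non-negative threshold; larger values give lower rank.
    """
    n3 = T.shape[2]
    Tf = np.fft.fft(T, axis=2)           # move to the Fourier domain
    out = np.empty_like(Tf)              # complex result buffer
    for k in range(n3):
        U, s, Vh = np.linalg.svd(Tf[:, :, k], full_matrices=False)
        s = np.maximum(s - tau, 0.0)     # shrink singular values toward zero
        out[:, :, k] = (U * s) @ Vh      # rebuild the thresholded slice
    # For real input the result is real up to rounding error.
    return np.real(np.fft.ifft(out, axis=2))
```

With `tau = 0` the input is returned unchanged, and a very large `tau` collapses the tensor to zero; in an augmented Lagrangian loop a step like this would typically alternate with updates of the remaining variables and the multipliers.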
Pages: 2878-2890
Number of pages: 13