Collaborative Quantization for Cross-Modal Similarity Search

被引:45
作者
Zhang, Ting [1 ]
Wang, Jingdong [2 ]
机构
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
来源
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016年
关键词
D O I
10.1109/CVPR.2016.224
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Cross-modal similarity search is a problem about designing a search system supporting querying across content modalities, e.g., using an image to search for texts or using a text to search for images. This paper presents a compact coding solution for efficient search, with a focus on the quantization approach which has already shown the superior performance over the hashing solutions in the single-modal similarity search. We propose a cross-modal quantization approach, which is among the early attempts to introduce quantization into cross-modal search. The major contribution lies in jointly learning the quantizers for both modalities through aligning the quantized representations for each pair of image and text belonging to a document. In addition, our approach simultaneously learns the common space for both modalities in which quantization is conducted to enable efficient and effective search using the Euclidean distance computed in the common space with fast distance table lookup. Experimental results compared with several competitive algorithms over three benchmark datasets demonstrate that the proposed approach achieves the state-of-the-art performance.
引用
收藏
页码:2036 / 2045
页数:10
相关论文
共 31 条
[1]  
[Anonymous], J PHARM ALLIED HLTH, DOI [10.3923/jpahs.2011.1.15, DOI 10.3923/JPAHS.2011.1.15]
[2]  
[Anonymous], 2014, PR MACH LEARN RES
[3]  
[Anonymous], 2010, P 18 ACM INT C MULT
[4]  
[Anonymous], PROBABILISTIC MODEL
[5]  
Bronstein MM, 2010, PROC CVPR IEEE, P3594, DOI 10.1109/CVPR.2010.5539928
[6]  
Chua T.-S., 2009, P ACM INT C IM VID R, P1
[7]   Histograms of oriented gradients for human detection [J].
Dalal, N ;
Triggs, B .
2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005, :886-893
[8]   Collective Matrix Factorization Hashing for Multimodal Data [J].
Ding, Guiguang ;
Guo, Yuchen ;
Zhou, Jile .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :2083-2090
[9]   Iterative Quantization: A Procrustean Approach to Learning Binary Codes for Large-Scale Image Retrieval [J].
Gong, Yunchao ;
Lazebnik, Svetlana ;
Gordo, Albert ;
Perronnin, Florent .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (12) :2916-2929
[10]   Product Quantization for Nearest Neighbor Search [J].
Jegou, Herve ;
Douze, Matthijs ;
Schmid, Cordelia .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (01) :117-128