Multiview Discrete Hashing for Scalable Multimedia Search

Cited by: 82
Authors
Shen, Xiaobo [1]
Shen, Fumin [2]
Liu, Li [3]
Yuan, Yun-Hao [4]
Liu, Weiwei [5]
Sun, Quan-Sen [6]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
[3] Northumbria Univ, Dept Comp Sci & Digital Technol, Newcastle Upon Tyne NE1 8ST, Tyne & Wear, England
[4] Yangzhou Univ, Dept Comp Sci & Technol, Yangzhou 225000, Jiangsu, Peoples R China
[5] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia
[6] Nanjing Univ Sci & Technol, Sch Comp & Engn, Nanjing 210094, Jiangsu, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Hashing; multi-view; multimedia search; SCALE; CODES;
DOI
10.1145/3178119
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Hashing techniques have recently gained increasing research interest in multimedia studies. Most existing hashing methods employ only a single type of feature for hash code learning. Multiview data, in which each view corresponds to one type of feature, generally provides more comprehensive information, but efficiently integrating multiple views to learn compact hash codes remains challenging. In this article, we propose a novel unsupervised hashing method, dubbed multiview discrete hashing (MvDH), that effectively exploits multiview data. Specifically, MvDH performs matrix factorization to generate the hash codes as the latent representations shared by multiple views, while simultaneously performing spectral clustering. Jointly learning the hash codes and cluster labels enables MvDH to generate more discriminative hash codes, which are optimal for classification. An efficient alternating algorithm is developed to solve the proposed optimization problem with guaranteed convergence and low computational complexity. The binary codes are optimized via the discrete cyclic coordinate descent (DCC) method to reduce quantization errors. Extensive experimental results on three large-scale benchmark datasets demonstrate the superiority of the proposed method over several state-of-the-art methods in terms of both accuracy and scalability.
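The discrete coordinate descent idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the MvDH algorithm itself: it assumes a simplified objective, the sum over views of ||X_v - U_v B||_F^2 with shared binary codes B in {-1, +1}, keeps the per-view bases U_v fixed, and leaves out the spectral clustering term entirely. The names `objective`, `dcc_update`, `views`, `bases`, and `codes` are hypothetical and chosen only for illustration.

```python
import random


def sign(x):
    # Map zero to +1 so every code entry stays in {-1, +1}.
    return 1 if x >= 0 else -1


def objective(views, bases, codes):
    """Sum over views of ||X_v - U_v B||_F^2 (simplified, assumed objective)."""
    r = len(codes)
    total = 0.0
    for X, U in zip(views, bases):
        for i, row in enumerate(X):          # feature dimension i
            for j, x in enumerate(row):      # sample j
                approx = sum(U[i][k] * codes[k][j] for k in range(r))
                total += (x - approx) ** 2
    return total


def dcc_update(views, bases, codes):
    """One cyclic pass over the bits: update one bit at a time, others fixed."""
    r, n = len(codes), len(codes[0])
    for k in range(r):
        for j in range(n):
            score = 0.0
            for X, U in zip(views, bases):
                for i in range(len(X)):
                    # Residual of sample j without the contribution of bit k.
                    rest = sum(U[i][l] * codes[l][j] for l in range(r) if l != k)
                    score += U[i][k] * (X[i][j] - rest)
            # The objective is linear in this bit (up to a constant),
            # so the minimizer over {-1, +1} is the sign of the score.
            codes[k][j] = sign(score)
    return codes


# Toy data: two views (5 and 3 features), 8 samples, 4-bit codes.
random.seed(0)
n, r, dims = 8, 4, [5, 3]
views = [[[random.gauss(0, 1) for _ in range(n)] for _ in range(d)] for d in dims]
bases = [[[random.gauss(0, 1) for _ in range(r)] for _ in range(d)] for d in dims]
codes = [[random.choice([-1, 1]) for _ in range(n)] for _ in range(r)]

before = objective(views, bases, codes)
codes = dcc_update(views, bases, codes)
after = objective(views, bases, codes)
print(before, after)
```

Because each bit is set to the exact minimizer of the objective with all other bits held fixed, a pass never increases the reconstruction error, and the codes stay binary throughout rather than being relaxed and quantized afterwards. A full alternating scheme would also re-estimate the bases between passes, which this sketch omits.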
Pages: 21