SCRATCH: A Scalable Discrete Matrix Factorization Hashing for Cross-Modal Retrieval

Cited by: 46
Authors
Li, Chuan-Xiang [1]
Chen, Zhen-Duo [1]
Zhang, Peng-Fei [1]
Luo, Xin [1]
Nie, Liqiang [2]
Zhang, Wei [3]
Xu, Xin-Shun [1]
Affiliations
[1] Shandong Univ, Sch Software, Jinan 250101, Peoples R China
[2] Shandong Univ, Sch Comp Sci & Technol, Qingdao 266237, Peoples R China
[3] Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Peoples R China
Source
PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18) | 2018
Funding
National Natural Science Foundation of China
Keywords
Cross-Modal Retrieval; Hashing; Matrix Factorization; Discrete Optimization; IMAGE; QUANTIZATION;
DOI
10.1145/3240508.3240547
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In recent years, many hashing methods have been proposed for cross-modal retrieval, but several issues remain open. Some methods relax the binary constraints when generating hash codes, which can introduce large quantization error. Although discrete optimization schemes have been proposed, most of them are time-consuming. In addition, most existing supervised hashing methods use an n × n similarity matrix during optimization, which makes them unscalable. To address these issues, we present a novel supervised cross-modal hashing method, Scalable disCRete mATrix faCtorization Hashing (SCRATCH). It leverages collective matrix factorization on the kernelized features and semantic embedding with labels to find a latent semantic space that preserves intra- and inter-modality similarities. It also incorporates the label matrix, rather than the similarity matrix, into the loss function. Based on the proposed loss function and an iterative optimization algorithm, it learns the hash functions and binary codes simultaneously. Moreover, the binary codes are generated discretely, which reduces the quantization error introduced by relaxation. Its time complexity is linear in the size of the dataset, making it scalable to large datasets. Extensive experiments on three benchmark datasets (Wiki, MIRFlickr-25K, and NUS-WIDE) show that SCRATCH outperforms several state-of-the-art unsupervised and supervised hashing methods for cross-modal retrieval.
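The following is a toy illustration of three ideas the abstract mentions, not the SCRATCH algorithm itself: kernelized features, the storage gap between an n × n similarity matrix and an n × c label matrix, and the quantization error that arises when relaxed real-valued codes are thresholded to binary. Everything here is an assumption made for illustration: the random data, the RBF kernel with randomly chosen anchors, the random projection `W`, and all variable names.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: samples, feature dim, classes, anchors, code length.
n, d, c, m, r = 200, 16, 5, 20, 8

X = rng.normal(size=(n, d))                         # toy features for one modality
L = (rng.random(size=(n, c)) < 0.3).astype(float)   # toy multi-label matrix, n x c

# Kernelized features (assumed RBF kernel against m randomly chosen anchors).
anchors = X[rng.choice(n, size=m, replace=False)]
sq_dist = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
sigma2 = sq_dist.mean()                             # simple bandwidth heuristic
phi = np.exp(-sq_dist / (2 * sigma2))               # n x m

# Storage: a pairwise similarity matrix grows as n^2, a label matrix as n * c.
similarity_entries = n * n
label_entries = L.size

# Relaxed real-valued codes vs. binary codes obtained by sign thresholding.
W = rng.normal(size=(m, r))                         # random projection, not learned
V = phi @ W                                         # relaxed codes, n x r
B = np.where(V >= 0, 1.0, -1.0)                     # binary codes in {-1, +1}
quantization_error = np.linalg.norm(B - V) ** 2     # squared Frobenius gap

print(similarity_entries, label_entries, quantization_error)
```

With these toy sizes the similarity matrix would hold n² entries against n·c for the label matrix, which is the scaling gap the abstract refers to. The gap between `B` and `V` is the kind of error a discrete optimization scheme aims to avoid.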
Pages: 1-9
Page count: 9