Hypergraph-Enhanced Hashing for Unsupervised Cross-Modal Retrieval via Robust Similarity Guidance

Cited by: 10

Authors:

Zhong, Fangming [1]

Chu, Chenglong [1]

Zhu, Zijie [1]

Chen, Zhikui [1]

Affiliations:

[1] Dalian University of Technology, Dalian, Liaoning, People's Republic of China

Source:

Proceedings of the 31st ACM International Conference on Multimedia, MM 2023 | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

Cross-modal retrieval; Unsupervised cross-modal hashing; Hypergraph learning; Similarity estimation;

DOI:

10.1145/3581783.3612116

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Unsupervised cross-modal hashing retrieval across the image and text modalities is challenging because the similarity guidance is suboptimal: the joint similarity matrices constructed by existing methods do not provide sufficiently clear guidance. Constructing a more robust similarity matrix is the key to solving this problem. Graph-based unsupervised cross-modal retrieval methods perform well at mining the semantic information of input samples, but graph hashing based on a traditional affinity graph cannot effectively capture their high-order semantic information. To overcome these limitations, this paper presents a novel hypergraph-based approach for unsupervised cross-modal retrieval that differs from previous work in two significant ways. First, to address the redundant information that is ubiquitous in current methods, it introduces a robust method for constructing the similarity matrix. Second, it proposes a novel hypergraph-enhanced module that produces embedding vectors for the input data through hypergraph convolution and an attention mechanism, capturing important high-order semantics. The approach is evaluated on the NUS-WIDE and MIRFlickr datasets and yields state-of-the-art performance for unsupervised cross-modal retrieval.

Citation

Pages: 3517-3527

Page count: 11

