Deep Multi-Modal Hashing With Semantic Enhancement for Multi-Label Micro-Video Retrieval

Cited by: 1
Authors
Jing, Peiguang [1]
Sun, Haoyi [2]
Nie, Liqiang [3]
Li, Yun [4, 5]
Su, Yuting [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Tianjin Univ, Sch Future Technol, Tianjin 300072, Peoples R China
[3] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518055, Peoples R China
[4] Guangxi Univ Finance & Econ, Sch Big Data & Artificial Intelligence, Guangxi 530001, Peoples R China
[5] Guangxi Key Lab Big Data Finance & Econ, Nanning 530001, Guangxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Hash functions; Encoding; Representation learning; Convolutional neural networks; Quantization (signal); Kernel; Deep hashing; micro-video retrieval; multi-label; multi-modality; MAXIMUM-LIKELIHOOD; QUANTIZATION;
DOI
10.1109/TKDE.2023.3337077
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The pressing need for low storage and high efficiency has significantly propelled the advancement of deep hashing techniques in the realm of large-scale search and retrieval tasks. As one of the most prevailing forms of user-generated contents, micro-videos usually represent more complicated multi-modal behaviors that are further challenged in multi-label retrieval. Existing multi-modal hashing methods tend to prioritize the complementarity and consistency in multi-modal fusion, while neglecting the completeness problem. In this paper, we propose a deep multi-modal hashing with semantic enhancement (DMHSE) method that effectively integrates complete multi-modal representation learning with discriminative binary coding by means of collaboration between two distinct encoders, FoldCoder and HashCoder. FoldCoder translates latent multi-modal representation learning to a degradation process through mimicking data transmitting. Further, it incorporates a prompt learning paradigm to maximize the utilization of multi-label semantics for guiding representation learning. HashCoder combines pairwise and central constraints to ensure more discriminative hashing results. Pairwise constraint preserves the original local relevance structure, while central constraint tackles the problem of semantic ambiguity in multi-label data by leveraging the global label distribution. Experimental results demonstrate that DMHSE achieves superior performance in multi-label micro-video retrieval tasks.
Pages: 5080-5091
Page count: 12