SDDA: A progressive self-distillation with decoupled alignment for multimodal image–text classification

Cited by: 0
Authors
Chen, Xiaohao [1 ]
Shuai, Qianjun [1 ]
Hu, Feng [1 ]
Cheng, Yongqiang [2 ]
Affiliations
[1] College of Information and Communication Engineering, Communication University of China, Beijing,100024, China
[2] Faculty of Technology, University of Sunderland, Sunderland,SR6 0DD, United Kingdom
Keywords
Image classification;
DOI
10.1016/j.neucom.2024.128794
Abstract
Multimodal image–text classification aims to infer the correct category from the information encapsulated in image–text pairs. Despite the commendable performance of current image–text methods, intrinsic multimodal heterogeneity remains a challenge, with the contributions of different modalities varying considerably. In this study, we address this issue by introducing a novel progressive Self-Distillation with Decoupled Alignment (SDDA) approach, aimed at facilitating fine-grained alignment of the shared and private components of image and text features in a low-dimensional space, thereby reducing information redundancy. Specifically, each modality representation is decoupled in an autoregressive manner into two segments within a modality-irrelevant/exclusive space. SDDA imparts additional knowledge transfer to each decoupled segment via self-distillation, while also offering flexible, richer multimodal knowledge supervision for unimodal features. Multimodal classification experiments on two publicly available benchmark datasets verify the efficacy of the algorithm, demonstrating that SDDA surpasses state-of-the-art baselines. © 2024 Elsevier B.V.
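The abstract names two mechanisms: decoupling each modality's representation into shared and private segments, and transferring knowledge to each segment via self-distillation (a fused multimodal "teacher" supervising unimodal "students" through softened outputs). As a rough illustration only, the sketch below shows the general shape of such a pipeline in numpy; the projection matrices, temperature, loss form, and all function names are this editor's assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields softer distillation targets.
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q): divergence of the student distribution q from the teacher p.
    p, q = np.asarray(p) + eps, np.asarray(q) + eps
    return float(np.sum(p * np.log(p / q)))

def decouple(feat, W_shared, W_private):
    # Split one modality's feature into a modality-shared and a
    # modality-private segment via two (hypothetical) linear projections.
    return W_shared @ feat, W_private @ feat

def self_distillation_loss(teacher_logits, student_logits, T=2.0):
    # Soft-target distillation: the student mimics the teacher's softened output.
    return kl_divergence(softmax(teacher_logits, T), softmax(student_logits, T))

rng = np.random.default_rng(0)
img_feat = rng.normal(size=8)                       # toy image feature vector
Ws = rng.normal(size=(4, 8))                        # assumed shared projection
Wp = rng.normal(size=(4, 8))                        # assumed private projection
shared, private = decouple(img_feat, Ws, Wp)

teacher = rng.normal(size=3)   # fused multimodal logits (the "teacher")
student = rng.normal(size=3)   # unimodal logits (the "student")
loss = self_distillation_loss(teacher, student)
```

In this toy setup, a perfectly aligned student (identical logits) yields a loss of zero, and any mismatch yields a positive penalty; a real system would learn the projections and combine this term with the standard classification loss.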
Related papers
50 items in total
  • [1] Tolerant Self-Distillation for image classification
    Liu, Mushui
    Yu, Yunlong
    Ji, Zhong
    Han, Jungong
    Zhang, Zhongfei
    NEURAL NETWORKS, 2024, 174
  • [2] Image classification based on self-distillation
    Li, Yuting
    Qing, Linbo
    He, Xiaohai
    Chen, Honggang
    Liu, Qiang
    APPLIED INTELLIGENCE, 2023, 53 (08) : 9396 - 9408
  • [3] Simple Self-Distillation Learning for Noisy Image Classification
    Sasaya, Tenta
    Watanabe, Takashi
    Ida, Takashi
    Ono, Toshiyuki
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 795 - 799
  • [4] A Self-distillation Lightweight Image Classification Network Scheme
    Ni, S.
    Ma, X.
    JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS, 2023, 46 (06) : 66 - 71
  • [5] Masked Self-Distillation Domain Adaptation for Hyperspectral Image Classification
    Fang, Zhuoqun
    He, Wenqiang
    Li, Zhaokui
    Du, Qian
    Chen, Qiusheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [6] TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias
    Jol, Sanghyun
    Ryu, Soohyun
    Kim, Sungyub
    Yang, Eunho
    Kim, Kyungsu
    COMPUTER VISION - ECCV 2024, PT LXXXI, 2025, 15139 : 341 - 357
  • [7] Dynamic image super-resolution via progressive contrastive self-distillation
    Zhang, Zhizhong
    Xie, Yuan
    Zhang, Chong
    Wang, Yanbo
    Qu, Yanyun
    Lin, Shaohui
    Ma, Lizhuang
    Tian, Qi
    PATTERN RECOGNITION, 2024, 153
  • [8] A Feature Map Fusion Self-Distillation Scheme for Image Classification Networks
    Qin, Zhenkai
    Ni, Shuiping
    Zhu, Mingfu
    Jia, Yue
    Liu, Shangxin
    Chen, Yawei
    ELECTRONICS, 2025, 14 (01)
  • [9] Towards Elastic Image Super-Resolution Network via Progressive Self-distillation
    Yu, Xin'an
    Zhang, Dongyang
    Liu, Cencen
    Dong, Qiang
    Duan, Guiduo
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VIII, 2025, 15038 : 137 - 150