Source-Free Image-Text Matching via Uncertainty-Aware Learning

Citations: 0
Authors
Tian, Mengxiao [1 ,2 ]
Yang, Shuo [3 ]
Wu, Xinxiao [1 ,2 ]
Jia, Yunde [3 ]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing Lab Intelligent Informat Technol, Beijing 100081, Peoples R China
[2] Shenzhen MSU BIT Univ, Guangdong Prov Lab Machine Percept & Intelligent C, Shenzhen 518172, Peoples R China
[3] Shenzhen MSU BIT Univ, Guangdong Prov Lab Machine Percept & Intelligent C, Shenzhen 518172, Peoples R China
Keywords
Adaptation models; Uncertainty; Noise measurement; Data models; Training; Noise; Visualization; Measurement uncertainty; Computational modeling; Testing; Image-text matching; source-free adaptation; uncertainty-aware learning;
DOI
10.1109/LSP.2024.3488521
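Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract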
When a trained image-text matching model is applied to a new scenario, its performance may degrade substantially due to domain shift, which makes it impractical in real-world applications. In this paper, we make the first attempt at adapting an image-text matching model well-trained on a labeled source domain to an unlabeled target domain in the absence of source data, namely, source-free image-text matching. This task is challenging since there is no direct access to the source data when learning to reduce the domain shift. To address this challenge, we propose a simple yet effective method that introduces uncertainty-aware learning to generate high-quality pseudo-pairs of image and text for target adaptation. Specifically, we first use the pre-trained source model to retrieve several top-ranked image-text pairs from the target domain as pseudo-pairs. We then model the uncertainty of each pseudo-pair by calculating the variance of the retrieved texts (resp. images) given the paired image (resp. text) as the query. Finally, we incorporate the uncertainty into an objective function to down-weight noisy pseudo-pairs for better training, thereby enhancing adaptation. This uncertainty-aware training approach can be generally applied to all existing models. Extensive experiments on the COCO and Flickr30K datasets demonstrate the effectiveness of the proposed method.
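The three steps named in the abstract (retrieve top-ranked pseudo-pairs, score each by the variance of what was retrieved, down-weight uncertain pairs in the objective) can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the function names, cosine similarity as the matching score, the top-1 text as the pseudo-pair, the mean per-dimension variance of the top-k text embeddings as the uncertainty, `exp(-uncertainty)` as the weight, and a hardest-negative hinge loss as the objective. Only the image-to-text direction is shown; the text-to-image direction would be symmetric.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def pseudo_pairs_with_uncertainty(img_emb, txt_emb, k=3):
    """Hypothetical helper: build image-to-text pseudo-pairs and their uncertainty.

    img_emb: (N_img, D) target-domain image embeddings from the source model.
    txt_emb: (N_txt, D) target-domain text embeddings from the source model.
    Returns the index of the pseudo-paired text for every image and a
    non-negative uncertainty score for every pseudo-pair.
    """
    img_emb, txt_emb = l2_normalize(img_emb), l2_normalize(txt_emb)
    sim = img_emb @ txt_emb.T                  # (N_img, N_txt) cosine similarities
    topk = np.argsort(-sim, axis=1)[:, :k]     # k best-matching texts per image
    pseudo_txt = topk[:, 0]                    # assumption: top-1 text is the pseudo-pair
    # Assumption: uncertainty = variance across the k retrieved text embeddings,
    # averaged over embedding dimensions. Tightly clustered retrievals score low.
    uncertainty = txt_emb[topk].var(axis=1).mean(axis=1)
    return pseudo_txt, uncertainty


def weighted_hinge_loss(img_emb, txt_emb, pseudo_txt, uncertainty, margin=0.2):
    """Hypothetical objective: hinge loss per pseudo-pair, down-weighted by uncertainty."""
    img_emb, txt_emb = l2_normalize(img_emb), l2_normalize(txt_emb)
    sim = img_emb @ txt_emb.T
    rows = np.arange(len(img_emb))
    pos = sim[rows, pseudo_txt]                # similarity of each pseudo-pair
    neg = sim.copy()
    neg[rows, pseudo_txt] = -np.inf            # exclude the pseudo-positive itself
    hardest_neg = neg.max(axis=1)              # most similar remaining text
    per_pair = np.maximum(0.0, margin + hardest_neg - pos)
    weights = np.exp(-uncertainty)             # assumption: weight decays with uncertainty
    return float((weights * per_pair).sum() / weights.sum())
```

In an actual adaptation loop the loss would be computed in a framework with automatic differentiation and used to fine-tune the model on the unlabeled target domain; the NumPy version above only shows how the uncertainty enters the weighting.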
Pages: 3059 - 3063
Number of pages: 5