Source-Free Image-Text Matching via Uncertainty-Aware Learning

Authors
Tian, Mengxiao [1, 2]
Yang, Shuo [3]
Wu, Xinxiao [1, 2]
Jia, Yunde [3]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing Lab Intelligent Informat Technol, Beijing 100081, Peoples R China
[2] Shenzhen MSU BIT Univ, Guangdong Prov Lab Machine Percept & Intelligent C, Shenzhen 518172, Peoples R China
[3] Shenzhen MSU BIT Univ, Guangdong Prov Lab Machine Percept & Intelligent C, Shenzhen 518172, Peoples R China
Keywords
Adaptation models; Uncertainty; Noise measurement; Data models; Training; Noise; Visualization; Measurement uncertainty; Computational modeling; Testing; Image-text matching; source-free adaptation; uncertainty-aware learning
DOI
10.1109/LSP.2024.3488521
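Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809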
Abstract
When a trained image-text matching model is applied to a new scenario, its performance may degrade substantially due to domain shift, which makes it impractical in real-world applications. In this paper, we make the first attempt at adapting an image-text matching model well-trained on a labeled source domain to an unlabeled target domain in the absence of source data, namely, source-free image-text matching. This task is challenging because there is no direct access to the source data when learning to reduce the domain shift. To address this challenge, we propose a simple yet effective method that introduces uncertainty-aware learning to generate high-quality pseudo-pairs of image and text for target adaptation. Specifically, we first use the pre-trained source model to retrieve several top-ranked image-text pairs from the target domain as pseudo-pairs. We then model the uncertainty of each pseudo-pair by calculating the variance of the retrieved texts (resp. images) given the paired image (resp. text) as the query. Finally, we incorporate the uncertainty into an objective function to down-weight noisy pseudo-pairs for better training, thereby enhancing adaptation. This uncertainty-aware training approach can be generally applied to all existing models. Extensive experiments on the COCO and Flickr30K datasets demonstrate the effectiveness of the proposed method.
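The three steps in the abstract (retrieve pseudo-pairs, score their uncertainty, down-weight noisy pairs) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract gives no formulas, so the function names, the top-1 pairing rule, the use of mean per-dimension embedding variance as the uncertainty score, the averaging of the two retrieval directions, and the `exp(-u / temperature)` weighting are all assumptions made here for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def pseudo_pairs_with_uncertainty(img_emb, txt_emb, k=3):
    """Hypothetical helper: pair each image with its top-1 retrieved text and
    score the pair by the variance of the top-k retrieved embeddings.

    img_emb: (n_img, d) target-domain image embeddings from the source model.
    txt_emb: (n_txt, d) target-domain text embeddings from the source model.
    """
    img = l2_normalize(img_emb)
    txt = l2_normalize(txt_emb)
    sim = img @ txt.T  # (n_img, n_txt) cosine similarities

    # Image as query: indices of the k most similar texts per image.
    topk_txt = np.argsort(-sim, axis=1)[:, :k]
    pair_txt = topk_txt[:, 0]  # assumed pairing rule: top-1 text
    # Assumed uncertainty: variance across the k retrieved texts, averaged over dims.
    u_i2t = txt[topk_txt].var(axis=1).mean(axis=1)

    # Paired text as query: variance across its k retrieved images.
    topk_img = np.argsort(-sim.T, axis=1)[:, :k]
    u_t2i = img[topk_img].var(axis=1).mean(axis=1)[pair_txt]

    # Assumed combination: simple average of both retrieval directions.
    return pair_txt, 0.5 * (u_i2t + u_t2i)


def uncertainty_weighted_loss(per_pair_loss, uncertainty, temperature=1.0):
    """Assumed weighting: w = exp(-u / temperature), normalized to sum to 1,
    so pairs with higher uncertainty contribute less to the objective."""
    weights = np.exp(-np.asarray(uncertainty) / temperature)
    weights = weights / weights.sum()
    return float((weights * np.asarray(per_pair_loss)).sum())
```

In practice, `per_pair_loss` would come from whatever matching loss the adapted model already uses (for example a triplet ranking loss), which is consistent with the abstract's claim that the weighting can be attached to existing models.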
Pages: 3059-3063
Page count: 5