COSDA: Covariance regularized semantic data augmentation for self-supervised visual representation learning

Cited by: 0
Authors
Chen, Hui
Ma, Yongqiang
Jiang, Jingjing
Zheng, Nanning [1]
Affiliations
[1] Xi An Jiao Tong Univ, Natl Engn Res Ctr Visual Informat & Applicat, Natl Key Lab Human Machine Hybrid Augmented Intell, Xian 710049, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Self-supervised visual representation learning; Contrastive learning; Semantic data augmentation
DOI
10.1016/j.knosys.2025.113080
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent contrastive learning-based self-supervised learning has seen significant improvements through employing an extensive data augmentation strategy, particularly focusing on the generation of positive pairs. However, the current techniques primarily operate at the pixel level, confined to basic spatial and color transformations, thus lacking the capability to incorporate more complex semantic alterations such as object repositioning, rotation, or color modification within the image. Consequently, the resultant positive pairs are less informative for learning features that are invariant to such semantic variations. In this work, we introduce a new methodology termed COvariance Regularized Semantic Data Augmentation (COSDA), designed to generate a diverse collection of feature embeddings that serve as positives relative to an anchor point. These generated features are intended to possess distinct semantic characteristics from the anchor point while maintaining consistent category identities, accomplished through Gaussian sampling in the deep feature space. By theoretically analyzing the scenario where the number of generated positive features approaches infinity, we establish an upper bound for the InfoNCE loss and optimize this bound without explicit feature generation. Rigorous experimental assessments, conducted on datasets of varying scales, alongside downstream tasks encompassing detection and segmentation, corroborate the efficacy of COSDA.
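To make the idea in the abstract concrete, the following is a minimal sketch of drawing feature-space positives around an anchor by Gaussian sampling and scoring them with an InfoNCE loss. It is not the authors' implementation: the function names (`sample_positives`, `info_nce`), the `strength` scale, the temperature value, and the use of an empirical batch covariance are all assumptions made for illustration. The abstract also states that COSDA optimizes a closed-form upper bound instead of generating features explicitly; this sketch shows only the explicit-sampling view that such a bound would replace.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products act as cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def sample_positives(anchor, cov, num_samples, strength, rng):
    """Draw positives z ~ N(anchor, strength * cov) in feature space.

    `strength` is a hypothetical knob controlling how far samples stray
    from the anchor; it does not come from the source.
    """
    return rng.multivariate_normal(anchor, strength * cov, size=num_samples)


def info_nce(anchor, positives, negatives, temperature=0.2):
    """Mean InfoNCE loss of one anchor over several positives.

    Each positive is contrasted against the same set of negatives.
    """
    a = l2_normalize(anchor)
    pos_logits = l2_normalize(positives) @ a / temperature  # shape (P,)
    neg_logits = l2_normalize(negatives) @ a / temperature  # shape (N,)
    # Subtract the largest logit before exponentiating for numerical stability.
    m = max(pos_logits.max(), neg_logits.max())
    neg_sum = np.exp(neg_logits - m).sum()
    log_denom = m + np.log(np.exp(pos_logits - m) + neg_sum)  # shape (P,)
    return float(np.mean(log_denom - pos_logits))


rng = np.random.default_rng(0)
dim = 8
anchor = rng.normal(size=dim)
negatives = rng.normal(size=(16, dim))

# Assumption: the covariance is estimated from a batch of features.
batch_features = rng.normal(size=(64, dim))
cov = np.cov(batch_features, rowvar=False)

positives = sample_positives(anchor, cov, num_samples=32, strength=0.1, rng=rng)
loss = info_nce(anchor, positives, negatives)
```

Averaging the loss over more and more sampled positives approximates an expectation under the Gaussian, which is the quantity the abstract says is bounded analytically as the number of positives approaches infinity.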
Pages: 10