A Generic Self-Supervised Framework of Learning Invariant Discriminative Features

Cited by: 4
Authors
Ntelemis, Foivos [1 ]
Jin, Yaochu [1 ,2 ]
Thomas, Spencer A. [1 ,3 ]
Affiliations
[1] Univ Surrey, Dept Comp Sci, Guildford GU2 7XH, England
[2] Bielefeld Univ, Fac Technol, D-33619 Bielefeld, Germany
[3] Natl Phys Lab, Teddington TW11 0LW, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Training; Visualization; Perturbation methods; Computational modeling; Task analysis; Feature extraction; Data models; Deep neural models; feature extraction; regularized optimal transport; self-supervised learning (SSL); virtual adversarial training (VAT); DIMENSIONALITY;
DOI
10.1109/TNNLS.2023.3265607
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Self-supervised learning (SSL) has become a popular method for generating invariant representations without the need for human annotations. Nonetheless, the desired invariant representation is achieved by applying prior online transformation functions to the input data. As a result, each SSL framework is customized for a particular data type, for example, visual data, and further modifications are required if it is used for other dataset types. On the other hand, the autoencoder (AE), a generic and widely applicable framework, mainly focuses on dimension reduction and is not suited for learning invariant representations. This article proposes a generic SSL framework based on a constrained self-labeling assignment process that prevents degenerate solutions. Specifically, the prior transformation functions are replaced with a self-transformation mechanism, derived through an unsupervised adversarial training process, for imposing invariant representations. Via the self-transformation mechanism, pairs of augmented instances can be generated from the same input data. Finally, a training objective based on contrastive learning is designed by leveraging both the self-labeling assignment and the self-transformation mechanism. Although the self-transformation process is very generic, the proposed training strategy outperforms a majority of state-of-the-art representation learning methods based on AE structures. To validate the performance of our method, we conduct experiments on four types of data, namely, visual, audio, text, and mass spectrometry data, and compare the results in terms of four quantitative metrics. Our comparison results demonstrate that the proposed method is effective and robust in identifying patterns within the tested datasets.
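The constrained self-labeling assignment the abstract describes, together with the "regularized optimal transport" keyword, suggests a Sinkhorn-Knopp-style balancing step: soft cluster assignments are iteratively normalized so that no single cluster absorbs all samples, which is what prevents the degenerate solution. The sketch below is a minimal NumPy illustration of that general technique under stated assumptions; the function name, default parameters, and shapes are illustrative and are not taken from the authors' implementation.

```python
import numpy as np

def sinkhorn_assignments(scores, eps=0.05, n_iters=100):
    """Balanced soft cluster assignments via Sinkhorn-Knopp normalization.

    scores : (N, K) array of similarity logits between N samples
             and K cluster prototypes.
    eps    : entropic regularization temperature (smaller -> harder labels).
    Returns a (N, K) matrix Q whose rows each sum to 1 (soft labels) and
    whose columns are approximately balanced, avoiding cluster collapse.
    """
    # Exponentiate with a max-shift for numerical stability.
    Q = np.exp((scores - scores.max()) / eps)
    N, K = Q.shape
    Q /= Q.sum()  # normalize total mass to 1
    for _ in range(n_iters):
        # Column step: each cluster receives 1/K of the total mass.
        Q /= Q.sum(axis=0, keepdims=True)
        Q /= K
        # Row step: each sample carries 1/N of the total mass.
        Q /= Q.sum(axis=1, keepdims=True)
        Q /= N
    return Q * N  # rescale so each row is a probability distribution
```

Because the column constraint forces every prototype to claim roughly equal mass, the trivial optimum of assigning all inputs to one cluster is excluded, which is the role the abstract attributes to the constrained assignment process.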
Pages: 12938-12952 (15 pages)