Grouped Contrastive Learning of Self-Supervised Sentence Representation

Cited: 1
Authors
Wang, Qian [1 ]
Zhang, Weiqi [1 ]
Lei, Tianyi [1 ]
Peng, Dezhong [1 ,2 ,3 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci & Technol, Chengdu 610065, Peoples R China
[2] Chengdu Ruibei Yingte Informat Technol Co Ltd, Chengdu 610054, Peoples R China
[3] Sichuan Zhiqian Technol Co Ltd, Chengdu 610065, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / No. 17
Keywords
contrastive learning; self-attention; data augmentation; grouped representation; unsupervised learning;
DOI
10.3390/app13179873
CLC Number
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
This paper proposes a method called Grouped Contrastive Learning of self-supervised Sentence Representation (GCLSR), which can learn an effective and meaningful representation of sentences. Previous works take maximizing the similarity between two vectors as the objective of contrastive learning, which suffers from the high dimensionality of those vectors. In addition, most previous works adopt discrete data augmentation to obtain positive samples and directly employ contrastive frameworks from computer vision, which can hamper contrastive training because text data are discrete and sparse compared with image data. To address these issues, we design a novel contrastive learning framework, GCLSR, which divides the high-dimensional feature vector into several groups and computes a contrastive loss for each group, exploiting more local information and eventually obtaining a more fine-grained sentence representation. In addition, GCLSR includes a new self-attention mechanism and a continuous, partial word vector augmentation (PWVA). For discrete and sparse text data, self-attention helps the model focus on informative words by measuring the importance of every word in a sentence. Using the PWVA, GCLSR obtains high-quality positive samples for contrastive learning. Experimental results demonstrate that the proposed GCLSR achieves encouraging results on the challenging datasets of the semantic textual similarity (STS) task and transfer tasks.
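A minimal sketch of the grouped contrastive objective described in the abstract, assuming a PyTorch setting: each sentence embedding is split into equal-sized groups, an InfoNCE-style loss with in-batch negatives is computed per group, and the per-group losses are averaged. The group count, temperature, and all names here are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn.functional as F

def grouped_contrastive_loss(z1, z2, num_groups=4, temperature=0.05):
    """z1, z2: (batch, dim) embeddings of two augmented views of the same sentences."""
    batch, dim = z1.shape
    assert dim % num_groups == 0, "embedding dim must be divisible by num_groups"
    # Split each embedding into num_groups equal-sized chunks along the feature axis.
    g1 = z1.view(batch, num_groups, dim // num_groups)
    g2 = z2.view(batch, num_groups, dim // num_groups)
    labels = torch.arange(batch, device=z1.device)
    losses = []
    for g in range(num_groups):
        a = F.normalize(g1[:, g, :], dim=-1)
        b = F.normalize(g2[:, g, :], dim=-1)
        # In-batch negatives: each row's positive is the matching row of the other view.
        logits = a @ b.t() / temperature
        losses.append(F.cross_entropy(logits, labels))
    # Average the per-group losses so each group contributes its local information.
    return torch.stack(losses).mean()

# Example usage with random embeddings standing in for encoder outputs.
z1, z2 = torch.randn(8, 256), torch.randn(8, 256)
loss = grouped_contrastive_loss(z1, z2)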
Pages: 17