MoCoUTRL: a momentum contrastive framework for unsupervised text representation learning

Cited: 0
Authors
Zou, Ao [1]
Hao, Wenning [1]
Jin, Dawei [1]
Chen, Gang [1]
Sun, Feiyan [1,2]
Affiliations
[1] Army Engn Univ PLA, Command & Control Engn Coll, Nanjing, Peoples R China
[2] Jinling Inst Technol, Nanjing, Peoples R China
Keywords
Natural language processing; text representation learning; momentum contrast; alignment; uniformity
DOI
10.1080/09540091.2023.2221406
Chinese Library Classification (CLC)
TP18 [theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents MoCoUTRL, a momentum contrastive framework for unsupervised text representation learning. The model improves two aspects of recently popular contrastive learning algorithms in natural language processing (NLP). First, MoCoUTRL employs multi-granularity semantic contrastive learning objectives, enabling a more comprehensive understanding of the semantic features of samples. Second, MoCoUTRL uses a dynamic dictionary as an approximate ground-truth representation for each token, providing pseudo labels for token-level contrastive learning. MoCoUTRL can turn pre-trained language models (PLMs) and even large-scale language models (LLMs) into plug-and-play semantic feature extractors that can fuel multiple downstream tasks. Experimental results on several publicly available datasets, together with theoretical analysis, validate the effectiveness and interpretability of the proposed method.
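The momentum-contrast machinery the abstract refers to (a slowly updated key encoder plus a fixed-size dictionary of negative keys, scored with a contrastive loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`momentum_update`, `info_nce`), the toy vectors, and the queue size are all illustrative assumptions; only the general MoCo/InfoNCE pattern is taken from the abstract.

```python
import math
from collections import deque

def momentum_update(query_params, key_params, m=0.999):
    """MoCo-style momentum update: the key encoder's parameters trail the
    query encoder's as an exponential moving average (illustrative only)."""
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]

def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE contrastive loss for one query: pull it toward its positive
    key and push it away from the queued negative keys."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    logits = [dot(query, positive_key) / temperature]
    logits += [dot(query, nk) / temperature for nk in negative_keys]
    # numerically stable log-softmax; the positive key sits at index 0
    mx = max(logits)
    log_z = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_z - logits[0]

# The "dynamic dictionary" can be modelled as a fixed-size FIFO queue:
# new key representations are enqueued, the oldest are discarded.
queue = deque(maxlen=4)
for step in range(6):
    queue.append([float(step), 1.0])  # toy key vectors
```

A well-aligned query/positive pair yields a lower loss than a misaligned one, which is the signal that shapes the representation space; the `maxlen` queue keeps the set of negatives large and fresh without growing unboundedly.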
Pages: 20