MoCoUTRL: a momentum contrastive framework for unsupervised text representation learning

Cited by: 0
Authors
Zou, Ao [1]
Hao, Wenning [1]
Jin, Dawei [1]
Chen, Gang [1]
Sun, Feiyan [1,2]
Affiliations
[1] Army Engn Univ PLA, Command & Control Engn Coll, Nanjing, Peoples R China
[2] Jinling Inst Technol, Nanjing, Peoples R China
Keywords
Natural language processing; text representation learning; momentum contrast; alignment; uniformity
DOI
10.1080/09540091.2023.2221406
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This paper presents MoCoUTRL, a momentum contrastive framework for unsupervised text representation learning. The model improves two aspects of the contrastive learning algorithms recently popular in natural language processing (NLP). First, MoCoUTRL employs multi-granularity semantic contrastive learning objectives, enabling a more comprehensive understanding of the semantic features of samples. Second, MoCoUTRL uses a dynamic dictionary that acts as an approximate ground-truth representation for each token, providing pseudo-labels for token-level contrastive learning. MoCoUTRL can turn pre-trained language models (PLMs), and even large language models (LLMs), into plug-and-play semantic feature extractors that can fuel multiple downstream tasks. Experimental results on several publicly available datasets, together with theoretical analysis, validate the effectiveness and interpretability of the proposed method.
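The record gives no implementation details, but the abstract's core ingredients follow the general MoCo recipe (momentum-updated key encoder, dynamic dictionary maintained as a FIFO queue of key representations, InfoNCE contrastive objective). The following is a minimal NumPy sketch of those three ingredients under that assumption; all names (`momentum_update`, `DynamicDictionary`, `info_nce`) and hyperparameters are hypothetical and not taken from the paper.

```python
import numpy as np

def momentum_update(query_params, key_params, m=0.999):
    """EMA update of the key (momentum) encoder, MoCo-style:
    k <- m * k + (1 - m) * q for each parameter tensor."""
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]

class DynamicDictionary:
    """FIFO queue of key embeddings used as negatives
    (and, per the abstract, as approximate per-token pseudo-labels)."""
    def __init__(self, dim, size):
        self.queue = np.random.randn(size, dim)
        self.queue /= np.linalg.norm(self.queue, axis=1, keepdims=True)
        self.ptr = 0
        self.size = size

    def enqueue(self, keys):
        """Insert a batch of keys, overwriting the oldest entries."""
        n = keys.shape[0]
        idx = (self.ptr + np.arange(n)) % self.size
        self.queue[idx] = keys
        self.ptr = (self.ptr + n) % self.size

def info_nce(q, k_pos, queue, tau=0.07):
    """InfoNCE loss for a single query: one positive key
    against all queued negatives, with temperature tau."""
    q = q / np.linalg.norm(q)
    k_pos = k_pos / np.linalg.norm(k_pos)
    logits = np.concatenate([[q @ k_pos], queue @ q]) / tau
    logits -= logits.max()  # numerical stability before exp
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

In a full training loop the query encoder is updated by backpropagating this loss, the key encoder only via `momentum_update`, and each batch of keys is pushed into the dictionary with `enqueue` after the loss is computed.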
Pages: 20
References
52 in total
[41] Qian J, 2022, Findings of the Association for Computational Linguistics (ACL 2022), p. 2912
[42] Radford A, 2019, Language models are unsupervised multitask learners
[43] Raffel C, 2020, Journal of Machine Learning Research, Vol. 21
[44] Reimers N, 2019, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), p. 3982
[45] Keskar NS, 2019, arXiv, DOI 10.48550/arXiv.1909.05858
[46] Su J, 2021, arXiv, DOI 10.48550/arXiv.2103.15316
[47] Touvron H, 2023, arXiv, DOI 10.48550/arXiv.2302.13971
[48] van den Oord A, 2019, arXiv, arXiv:1807.03748
[49] Wang T, 2020, International Conference on Machine Learning (ICML 2020)
[50] Wolf T, 2020, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, p. 38