Unveiling Key Aspects of Fine-Tuning in Sentence Embeddings: A Representation Rank Analysis

Cited: 0
Authors
Jung, Euna [1 ]
Kim, Jaeill [2 ]
Ko, Jungmin [3 ]
Park, Jinwoo [1 ]
Rhee, Wonjong [3 ,4 ,5 ]
Affiliations
[1] Samsung Adv Inst Technol, Suwon 16678, Gyeonggi Do, South Korea
[2] LINE Investment Technol, Seongnam Si 13529, Gyeonggi Do, South Korea
[3] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, Seoul 08826, South Korea
[4] Seoul Natl Univ, Dept Intelligence & Informat, Seoul 08826, South Korea
[5] Seoul Natl Univ, RICS, Seoul 08826, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation of Singapore
Keywords
Training; Linguistics; Contrastive learning; Market research; Correlation; Semantics; Visualization; Phase measurement; Natural language processing; Loss measurement; Sentence embedding; self-supervised learning; contrastive learning; fine-tuning; representation rank;
DOI
10.1109/ACCESS.2024.3485705
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
The latest advancements in unsupervised learning of sentence embeddings predominantly involve employing contrastive learning-based (CL-based) fine-tuning over pre-trained language models. In this study, we analyze the latest sentence embedding methods by adopting representation rank as the primary tool of analysis. We first define Phase 1 and Phase 2 of fine-tuning based on when representation rank peaks. Utilizing these phases, we conduct a thorough analysis and obtain essential findings across key aspects, including alignment and uniformity, linguistic abilities, and correlation between performance and rank. For instance, we find that the dynamics of the key aspects can undergo significant changes as fine-tuning transitions from Phase 1 to Phase 2. Based on these findings, we experiment with a rank reduction (RR) strategy that facilitates rapid and stable fine-tuning of the latest CL-based methods. Through empirical investigations, we showcase the efficacy of RR in enhancing the performance and stability of five state-of-the-art sentence embedding methods. The code is available at https://github.com/SNU-DRL/SentenceEmbedding_Rank.
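To make the abstract's central quantity concrete, the following sketch shows one common way to measure the representation rank of a batch of sentence embeddings: the effective rank, i.e., the exponential of the entropy of the normalized singular-value spectrum. The abstract does not specify the exact rank measure used in the paper, so the function name effective_rank, the centering step, and the matrix shapes below are illustrative assumptions rather than the authors' implementation.

import numpy as np

def effective_rank(embeddings: np.ndarray, eps: float = 1e-12) -> float:
    """Effective rank of an (n_sentences, dim) embedding matrix, computed as
    exp(entropy) of the normalized singular-value distribution."""
    # Center the embeddings so the rank reflects the spread of the representation
    # rather than a dominant mean direction (an assumption, not the paper's recipe).
    X = embeddings - embeddings.mean(axis=0, keepdims=True)
    s = np.linalg.svd(X, compute_uv=False)        # singular values only
    p = s / (s.sum() + eps)                       # normalized spectrum
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.exp(entropy))

if __name__ == "__main__":
    # Hypothetical usage: evaluate a fixed probe set of sentences after each
    # fine-tuning step and track when the rank curve peaks (Phase 1 -> Phase 2).
    rng = np.random.default_rng(0)
    emb = rng.standard_normal((256, 768))         # e.g., 256 sentences, BERT-base dim
    print(f"effective rank: {effective_rank(emb):.1f}")

Tracking this scalar over training steps is one plausible way to reproduce the phase definition described in the abstract: Phase 1 would cover the steps up to the rank peak, and Phase 2 the steps after it.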
Pages: 159877 - 159888
Page count: 12