Unveiling Key Aspects of Fine-Tuning in Sentence Embeddings: A Representation Rank Analysis

Cited: 0
Authors
Jung, Euna [1 ]
Kim, Jaeill [2 ]
Ko, Jungmin [3 ]
Park, Jinwoo [1 ]
Rhee, Wonjong [3 ,4 ,5 ]
Affiliations
[1] Samsung Adv Inst Technol, Suwon 16678, Gyeonggi Do, South Korea
[2] LINE Investment Technol, Seongnam Si 13529, Gyeonggi Do, South Korea
[3] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, Seoul 08826, South Korea
[4] Seoul Natl Univ, Dept Intelligence & Informat, Seoul 08826, South Korea
[5] Seoul Natl Univ, RICS, Seoul 08826, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation of Singapore
Keywords
Training; Linguistics; Contrastive learning; Market research; Correlation; Semantics; Visualization; Phase measurement; Natural language processing; Loss measurement; Sentence embedding; self-supervised learning; contrastive learning; fine-tuning; representation rank;
DOI
10.1109/ACCESS.2024.3485705
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
The latest advancements in unsupervised learning of sentence embeddings predominantly involve employing contrastive learning-based (CL-based) fine-tuning over pre-trained language models. In this study, we analyze the latest sentence embedding methods by adopting representation rank as the primary tool of analysis. We first define Phase 1 and Phase 2 of fine-tuning based on when representation rank peaks. Utilizing these phases, we conduct a thorough analysis and obtain essential findings across key aspects, including alignment and uniformity, linguistic abilities, and correlation between performance and rank. For instance, we find that the dynamics of the key aspects can undergo significant changes as fine-tuning transitions from Phase 1 to Phase 2. Based on these findings, we experiment with a rank reduction (RR) strategy that facilitates rapid and stable fine-tuning of the latest CL-based methods. Through empirical investigations, we showcase the efficacy of RR in enhancing the performance and stability of five state-of-the-art sentence embedding methods. The code is available at https://github.com/SNU-DRL/SentenceEmbedding_Rank.
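To make the abstract's central quantity concrete, the following sketch shows one common way to measure the representation rank of a batch of sentence embeddings: the effective rank, i.e., the exponential of the entropy of the normalized singular-value spectrum. The abstract does not specify the exact rank measure used in the paper, so the function name effective_rank, the centering step, and the matrix shapes below are illustrative assumptions rather than the authors' implementation.

import numpy as np

def effective_rank(embeddings: np.ndarray, eps: float = 1e-12) -> float:
    """Effective rank of an (n_sentences, dim) embedding matrix, computed as
    exp(entropy) of the normalized singular-value distribution."""
    # Center the embeddings so the rank reflects the spread of the representation
    # rather than a dominant mean direction (an assumption, not the paper's recipe).
    X = embeddings - embeddings.mean(axis=0, keepdims=True)
    s = np.linalg.svd(X, compute_uv=False)        # singular values only
    p = s / (s.sum() + eps)                       # normalized spectrum
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.exp(entropy))

if __name__ == "__main__":
    # Hypothetical usage: evaluate a fixed probe set of sentences after each
    # fine-tuning step and track when the rank curve peaks (Phase 1 -> Phase 2).
    rng = np.random.default_rng(0)
    emb = rng.standard_normal((256, 768))         # e.g., 256 sentences, BERT-base dim
    print(f"effective rank: {effective_rank(emb):.1f}")

Tracking this scalar over training steps is one plausible way to reproduce the phase definition described in the abstract: Phase 1 would cover the steps up to the rank peak, and Phase 2 the steps after it.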
Pages: 159877 - 159888
Page count: 12