Optimizing Fine-Tuning in Quantized Language Models: An In-Depth Analysis of Key Variables

Cited: 0
Authors
Shen, Ao [1]
Lai, Zhiquan [1]
Li, Dongsheng [1]
Hu, Xiaoyu [2]
Affiliations
[1] Natl Univ Def Technol, Natl Key Lab Parallel & Distributed Comp, Changsha 410073, Peoples R China
[2] Acad Mil Sci, Strateg Assessments & Consultat Inst, Beijing 100091, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2025, Vol. 82, No. 1
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Large-scale Language Model; Parameter-Efficient Fine-Tuning; parameter quantization; key variable; trainable parameters; experimental analysis
DOI
10.32604/cmc.2024.057491
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Large-scale Language Models (LLMs) have achieved significant breakthroughs in Natural Language Processing (NLP), driven by the pre-training and fine-tuning paradigm. While this approach allows models to specialize in specific tasks with reduced training costs, the substantial memory requirements during fine-tuning present a barrier to broader deployment. Parameter-Efficient Fine-Tuning (PEFT) techniques, such as Low-Rank Adaptation (LoRA), and parameter quantization methods have emerged to address these challenges by optimizing memory usage and computational efficiency. Among these, QLoRA, which combines PEFT and quantization, has demonstrated notable success in reducing memory footprints during fine-tuning, prompting the development of various QLoRA variants. Despite these advancements, the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored. This study presents a comprehensive analysis of these key variables, focusing on their influence across different layer types and depths within LLM architectures. Our investigation uncovers several critical findings: (1) Larger layers, such as MLP layers, can maintain performance despite reductions in adapter rank, while smaller layers, like self-attention layers, are more sensitive to such changes; (2) The effectiveness of balancing factors depends more on their specific values than on layer type or depth; (3) In quantization-aware fine-tuning, larger layers can effectively utilize smaller adapters, whereas smaller layers struggle to do so. These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs. Moreover, for the same reduction in trainable parameters, shrinking the trainable parameters of a larger layer preserves fine-tuning accuracy better than shrinking those of a smaller one. The study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.
Pages: 307-325
Number of pages: 19
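As an illustration of the configuration strategy described in the abstract, the following Python sketch sets up QLoRA-style, quantization-aware LoRA fine-tuning with Hugging Face transformers and peft, assigning a smaller adapter rank to the larger MLP projections than to the smaller self-attention projections. This is not the authors' code: the base model name, the LLaMA-style module names, and the rank and alpha values are illustrative assumptions only.

```python
# Minimal sketch (not the paper's implementation): 4-bit quantized base model
# with LoRA adapters whose rank differs by layer type, reflecting the finding
# that larger MLP layers tolerate rank reduction better than attention layers.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-7b-hf"  # assumed base model; any causal LM works

# QLoRA-style setup: frozen base weights quantized to 4-bit NF4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Default rank r applies to the self-attention projections; rank_pattern
# overrides the larger MLP projections with a smaller rank, cutting trainable
# parameters where accuracy is reportedly least affected. Values are assumed.
lora_config = LoraConfig(
    r=16,            # default adapter rank (self-attention layers)
    lora_alpha=32,   # balancing factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    rank_pattern={"gate_proj": 4, "up_proj": 4, "down_proj": 4},
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # inspect the resulting parameter budget
```

Under these assumptions, most of the trainable-parameter savings come from the MLP adapters, which is where finding (1) of the abstract suggests rank can be reduced with the least loss of fine-tuning accuracy.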