Hydra: Multi-head low-rank adaptation for parameter efficient fine-tuning

Cited by: 3
Authors
Kim, Sanghyeon [1]
Yang, Hyunmo [2]
Kim, Yunghyun [2]
Hong, Youngjoon [3]
Park, Eunbyung [1, 2]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, 2066 Seobu Ro, Suwon 16419, South Korea
[2] Sungkyunkwan Univ, Dept Artificial Intelligence, 2066 Seobu Ro, Suwon 16419, South Korea
[3] Korea Adv Inst Sci & Technol, Dept Math Sci, 291 Daehak Ro, Taejon 305701, South Korea
Funding
National Research Foundation, Singapore
Keywords
Parameter efficient fine-tuning; Adapter; Transformer; Benchmark
DOI
10.1016/j.neunet.2024.106414
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The recent surge in large-scale foundation models has spurred the development of efficient methods for adapting these models to various downstream tasks. Low-rank adaptation methods, such as LoRA, have gained significant attention because they are highly parameter-efficient and add no inference latency. This paper investigates a more general form of adapter module, based on the analysis that parallel and sequential adaptation branches learn novel and general features, respectively, during fine-tuning. The proposed method, named Hydra, combines parallel and sequential branches to integrate their capabilities. It is more expressive than existing single-branch methods and enables the exploration of a broader range of optimal points in the fine-tuning process. In addition, the proposed method explicitly leverages the pre-trained weights by performing a linear combination of the pre-trained features, which allows the learned features to generalize better across diverse downstream tasks. Furthermore, we perform a comprehensive analysis of the characteristics of each adaptation branch with empirical evidence. Through an extensive range of experiments, we substantiate the efficiency and demonstrate the superior performance of Hydra. This comprehensive evaluation underscores the potential impact and effectiveness of Hydra in a variety of applications. The source code of this work is publicly available at https://github.com/extremebird/Hydra.
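The two-branch idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the parallel branch is taken to be a LoRA-style low-rank update applied to the input, the sequential branch a low-rank update applied to the pre-trained output `W0 x` (one reading of "a linear combination of the pre-trained features"), and all names (`W0`, `A_par`, `B_par`, `A_seq`, `B_seq`, `hydra_forward`, `merged_weight`) are hypothetical. Under those assumptions both branches are linear, so they can be folded into a single weight matrix.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two matrices with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1]; stands in for trained values."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_in, d_out, rank = 4, 3, 2  # toy sizes, chosen only for the illustration

W0 = rand_matrix(d_out, d_in, rng)     # frozen pre-trained weight
A_par = rand_matrix(rank, d_in, rng)   # parallel branch: down-projection of x
B_par = rand_matrix(d_out, rank, rng)  # parallel branch: up-projection
A_seq = rand_matrix(rank, d_out, rng)  # sequential branch: down-projection of W0 x
B_seq = rand_matrix(d_out, rank, rng)  # sequential branch: up-projection

def hydra_forward(x):
    """Assumed form: W0 x + B_par A_par x + B_seq A_seq (W0 x)."""
    h0 = matmul(W0, x)                      # pre-trained features
    par = matmul(B_par, matmul(A_par, x))   # new features computed from the input
    seq = matmul(B_seq, matmul(A_seq, h0))  # linear recombination of pre-trained features
    return add(add(h0, par), seq)

def merged_weight():
    """Fold both assumed branches into one matrix: W0 + B_par A_par + B_seq A_seq W0."""
    return add(add(W0, matmul(B_par, A_par)), matmul(matmul(B_seq, A_seq), W0))

x = rand_matrix(d_in, 1, rng)  # a single input column vector
y_branches = hydra_forward(x)
y_merged = matmul(merged_weight(), x)
max_diff = max(abs(a[0] - b[0]) for a, b in zip(y_branches, y_merged))
print(max_diff)  # difference between the branch form and the merged form
```

In this sketch the adapter matrices are random so that the comparison is non-trivial; LoRA-style methods commonly initialize the up-projection to zero so that fine-tuning starts from the pre-trained model. The authors' actual implementation is in the repository linked in the abstract.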
Pages: 11