FedITD: A Federated Parameter-Efficient Tuning With Pre-Trained Large Language Models and Transfer Learning Framework for Insider Threat Detection

Cited by: 0
Authors
Wang, Zhi Qiang [1 ]
Wang, Haopeng [1 ]
El Saddik, Abdulmotaleb [1 ,2 ]
Affiliations
[1] Univ Ottawa, Sch Elect Engn & Comp Sci, Multimedia Commun Res Lab MCRLab, Ottawa, ON K1N 6N5, Canada
[2] MBZUAI, Dept Comp Vis, Abu Dhabi, U Arab Emirates
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Data models; Adaptation models; Threat assessment; Tuning; Security; Organizations; Costs; Computational modeling; Transfer learning; Deep learning; Computer security; Data augmentation; Artificial intelligence; Machine learning; Cybersecurity; insider threat; deep learning; transformer; BERT; RoBERTa; XLNet; DistilBERT; GPT; data augmentation; artificial intelligence; machine learning; pre-trained LLM; PETuning; adapter; LoRA; BitFit; LLM; NLP;
DOI
10.1109/ACCESS.2024.3482988
Chinese Library Classification: TP [Automation Technology, Computer Technology]
Discipline Code: 0812
Abstract
Insider threats cause greater losses than external attacks, prompting organizations to invest in detection systems. However, several challenges remain: 1) Security and privacy concerns prevent data sharing, making it difficult to train robust models and identify new attacks. 2) The diversity and uniqueness of organizations require localized models, as a one-size-fits-all solution is unlikely to be effective. 3) High resource costs, delays, and data security concerns complicate building effective detection systems. This paper introduces FedITD, a flexible, hierarchical, and federated framework with local real-time detection systems, combining Large Language Models (LLM), Federated Learning (FL), Parameter-Efficient Tuning (PETuning), and Transfer Learning (TL) for insider threat detection. FedITD uses FL to protect privacy while indirectly integrating client information, and employs PETuning methods (Adapter, BitFit, LoRA) with LLMs (BERT, RoBERTa, XLNet, DistilBERT) to reduce resource usage and time delays. FedITD customizes client models and optimizes performance via transfer learning without central data transfer, further enhancing the detection of new attacks. FedITD outperforms other federated learning methods, and its performance is very close to that of the best centrally trained method. Extensive experimental results show FedITD's superior performance, adaptability to varied data, and reduction of resource costs, achieving an optimal balance in detection capability across source data, unlabeled local data, and global data. Alternative PETuning implementations are also explored in this paper.
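The abstract describes attaching PETuning modules (Adapter, BitFit, LoRA) to pre-trained LLMs and aggregating only those lightweight weights across clients with federated learning. The sketch below is a minimal illustration of that idea, not the authors' implementation: it assumes the Hugging Face transformers and peft libraries, a bert-base-uncased backbone, three clients, and the helper names build_client_model, trainable_state, and fed_avg, all of which are chosen here for illustration only.

# Minimal sketch (assumed libraries: transformers, peft, torch); local training is omitted.
import copy
import torch
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

def build_client_model(base_name: str = "bert-base-uncased"):
    """Wrap a pre-trained LLM with a LoRA adapter; only adapter weights stay trainable."""
    base = AutoModelForSequenceClassification.from_pretrained(base_name, num_labels=2)
    lora_cfg = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16,
                          lora_dropout=0.1, target_modules=["query", "value"])
    return get_peft_model(base, lora_cfg)

def trainable_state(model):
    """Collect only parameters that require gradients (LoRA matrices + classifier head)."""
    return {n: p.detach().clone() for n, p in model.named_parameters() if p.requires_grad}

def fed_avg(client_states):
    """FedAvg over the small adapter state dicts; the frozen backbone never leaves a client."""
    avg = copy.deepcopy(client_states[0])
    for name in avg:
        avg[name] = torch.stack([s[name] for s in client_states]).mean(dim=0)
    return avg

if __name__ == "__main__":
    clients = [build_client_model() for _ in range(3)]
    # ... each client would fine-tune locally on its own private logs here ...
    global_adapter = fed_avg([trainable_state(m) for m in clients])
    for m in clients:  # broadcast the aggregated adapter back to every client
        m.load_state_dict(global_adapter, strict=False)
    print(f"aggregated {len(global_adapter)} adapter tensors")

Because only the adapter and classification-head tensors require gradients, each federated round exchanges a small fraction of the model's parameters rather than the full backbone, which is consistent with the abstract's claim of reduced resource use and communication delay; the same loop could swap in BitFit or Adapter modules in place of LoRA.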
Pages: 160396-160417
Page count: 22
Related Papers
50 records
  • [31] VL-MPFT: Multitask Parameter-Efficient Fine-Tuning for Visual-Language Pre-trained Models via Task-Adaptive Masking
    Zhu, Min
    Liu, Guanming
    Wei, Zhihua
    PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035 : 379 - 394
  • [32] Enhancing Scalability of Pre-trained Language Models via Efficient Parameter Sharing
    Liu, Peiyu
    Gao, Ze-Feng
    Chen, Yushuo
    Zhao, Wayne Xin
    Wen, Ji-Rong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 13771 - 13785
  • [33] PreAdapter: Sparse Adaptive Parameter-efficient Transfer Learning for Language Models
    Mao, Chenyang
    Jin, Xiaoxiao
    Yue, Dengfeng
    Leng, Tuo
    2024 7TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA, ICAIBD 2024, 2024, : 218 - 225
  • [34] Parameter-efficient fine-tuning of large language models using semantic knowledge tuning
    Prottasha, Nusrat Jahan
    Mahmud, Asif
    Sobuj, Md. Shohanur Islam
    Bhat, Prakash
    Kowsher, Md
    Yousefi, Niloofar
    Garibay, Ozlem Ozmen
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [35] FedBM: Stealing knowledge from pre-trained language models for heterogeneous federated learning
    Zhu, Meilu
    Yang, Qiushi
    Gao, Zhifan
    Yuan, Yixuan
    Liu, Jun
    MEDICAL IMAGE ANALYSIS, 2025, 102
  • [36] Parameter-Efficient Multi-classification Software Defect Detection Method Based on Pre-trained LLMs
    Wang, Xuanye
    Lu, Lu
    Yang, Zhanyu
    Tian, Qingyan
    Lin, Haisha
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2024, 17 (01)
  • [37] DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
    Chen, Xuxi
    Chen, Tianlong
    Chen, Weizhu
    Awadallah, Ahmed Hassan
    Wang, Zhangyang
    Cheng, Yu
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 8208 - 8222
  • [38] Characterizing Communication in Distributed Parameter-Efficient Fine-Tuning for Large Language Models
    Alnaasan, Nawras
    Huang, Horng-Ruey
    Shafi, Aamir
    Subramoni, Hari
    Panda, Dhabaleswar K.
    2024 IEEE SYMPOSIUM ON HIGH-PERFORMANCE INTERCONNECTS, HOTI 2024, 2024, : 11 - 19
  • [39] AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models
    Yin, Yichun
    Chen, Cheng
    Shang, Lifeng
    Jiang, Xin
    Chen, Xiao
    Liu, Qun
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 5146 - 5157
  • [40] Efficient Data Learning for Open Information Extraction with Pre-trained Language Models
    Fan, Zhiyuan
    He, Shizhu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 13056 - 13063