Purifier: Plug-and-play Backdoor Mitigation for Pre-trained Models Via Anomaly Activation Suppression

Cited by: 3
Authors
Zhang, Xiaoyu [1 ,2 ]
Jin, Yulin [1 ]
Wang, Tao [1 ]
Lou, Jian [3 ]
Chen, Xiaofeng [1 ]
Affiliations
[1] Xidian Univ, State Key Lab Integrated Serv Networks ISN, Xian 710071, Peoples R China
[2] State Key Lab Cryptol, POB 5159, Beijing 100878, Peoples R China
[3] Xidian Univ, Guangzhou Inst Technol, Xian, Peoples R China
Source
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022 | 2022
Funding
National Natural Science Foundation of China
Keywords
Backdoor attack and defense; Deep Neural Network;
DOI
10.1145/3503161.3548065
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Discipline Code
081203 ; 0835 ;
Abstract
Pre-trained models have been widely adopted in deep learning development, allowing downstream user-specific tasks to be fine-tuned with enormous computation savings. However, backdoor attacks pose a severe security threat to models subsequently built upon compromised pre-trained models, calling for effective countermeasures to mitigate the backdoor threat before victim models are deployed in safety-critical applications. This paper proposes Purifier, a novel backdoor mitigation framework for pre-trained models that suppresses anomaly activation. Purifier is motivated by the observation that backdoor triggers induce anomalous activation patterns at different granularities (e.g., channel-wise, cube-wise, and feature-wise), and that suppressing at the right granularity is vital to both robustness and accuracy. Purifier can thus defend against diverse types of backdoor triggers without any prior knowledge of the attack, while remaining convenient and flexible to deploy, i.e., plug-and-play. Extensive experiments against a series of mainstream state-of-the-art attacks show that Purifier outperforms state-of-the-art defenses in both defense effectiveness and model inference accuracy on clean examples. Our code and Appendix can be found at github.com/RUIYUN-ML/Purifier.
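The abstract names the mechanism (suppressing anomalous activations at a chosen granularity, in a plug-and-play wrapper) but not its exact rule. Below is a minimal sketch of one plausible channel-wise variant in PyTorch; the class AnomalySuppressor, the percentile bound, and the calibration pass over clean batches are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of channel-wise anomaly activation suppression.
# Not the Purifier algorithm itself: the clipping rule and names here
# are assumptions for illustration only.
import torch
import torch.nn as nn


class AnomalySuppressor(nn.Module):
    """Wraps an existing layer and clamps its per-channel activations to
    upper bounds estimated from a small set of clean examples."""

    def __init__(self, layer: nn.Module, percentile: float = 99.0):
        super().__init__()
        self.layer = layer
        self.percentile = percentile
        # Filled in by calibrate(); empty means "pass through unchanged".
        self.register_buffer("upper", torch.empty(0))

    @torch.no_grad()
    def calibrate(self, clean_batches):
        # Record per-channel activation maxima over clean batches, then
        # keep a high percentile of them as the "normal" upper bound.
        maxima = []
        for x in clean_batches:
            act = self.layer(x)                      # (N, C, H, W)
            maxima.append(act.amax(dim=(0, 2, 3)))   # per-channel max
        stacked = torch.stack(maxima)                # (num_batches, C)
        self.upper = torch.quantile(stacked, self.percentile / 100.0, dim=0)

    def forward(self, x):
        act = self.layer(x)
        if self.upper.numel() == 0:
            return act  # not calibrated yet
        # Trigger inputs tend to fire a few channels far above the clean
        # range; clipping them suppresses the backdoor signal while
        # leaving in-range (clean) activations untouched.
        return torch.minimum(act, self.upper.view(1, -1, 1, 1))
```

In this hypothetical setup the wrapper would be plugged in without retraining, e.g. model.layer3 = AnomalySuppressor(model.layer3), followed by a single calibrate() pass over a few clean batches before deployment.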
Pages: 4291 - 4299 (9 pages)
Related Papers
50 items in total
  • [1] Plug-and-Play Document Modules for Pre-trained Models
    Xiao, Chaojun
    Zhang, Zhengyan
    Han, Xu
    Chan, Chi-Min
    Lin, Yankai
    Liu, Zhiyuan
    Li, Xiangyang
    Li, Zhonghua
    Cao, Zhao
    Sun, Maosong
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15713 - 15729
  • [2] Plug-and-Play Knowledge Injection for Pre-trained Language Models
    Zhang, Zhengyan
    Zeng, Zhiyuan
    Lin, Yankai
    Wang, Huadong
    Ye, Deming
    Xiao, Chaojun
    Han, Xu
    Liu, Zhiyuan
    Li, Peng
    Sun, Maosong
    Zhou, Jie
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 10641 - 10656
  • [3] Pre-trained Diffusion Models for Plug-and-Play Medical Image Enhancement
    Ma, Jun
    Zhu, Yuanzhi
    You, Chenyu
    Wang, Bo
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT III, 2023, 14222 : 3 - 13
  • [4] Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
    Xiao, Chaojun
    Luo, Yuqi
    Zhang, Wenbin
    Zhang, Pengle
    Han, Xu
    Lin, Yankai
    Zhang, Zhengyan
    Xie, Ruobing
    Liu, Zhiyuan
    Sun, Maosong
    Zhou, Jie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 9947 - 9959
  • [5] Aliasing Backdoor Attacks on Pre-trained Models
    Wei, Cheng'an
    Lee, Yeonjoon
    Chen, Kai
    Meng, Guozhu
    Lv, Peizhuo
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 2707 - 2724
  • [6] Mutual Information Guided Backdoor Mitigation for Pre-Trained Encoders
    Han, Tingxu
    Sun, Weisong
    Ding, Ziqi
    Fang, Chunrong
    Qian, Hanwei
    Li, Jiaxun
    Chen, Zhenyu
    Zhang, Xiangyu
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20 : 3414 - 3428
  • [7] Noise Calibration: Plug-and-Play Content-Preserving Video Enhancement Using Pre-trained Video Diffusion Models
    Yang, Qinyu
    Chen, Haoxin
    Zhang, Yong
    Xia, Menghan
    Cun, Xiaodong
    Su, Zhixun
    Shan, Ying
    COMPUTER VISION - ECCV 2024, PT XXXVI, 2025, 15094 : 307 - 326
  • [8] PPT: Backdoor Attacks on Pre-trained Models via Poisoned Prompt Tuning
    Du, Wei
    Zhao, Yichun
    Li, Boqun
    Liu, Gongshen
    Wang, Shilin
    PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 680 - 686
  • [9] Backdoor Pre-trained Models Can Transfer to All
    Shen, Lujia
    Ji, Shouling
    Zhang, Xuhong
    Li, Jinfeng
    Chen, Jing
    Shi, Jie
    Fang, Chengfang
    Yin, Jianwei
    Wang, Ting
    CCS '21: PROCEEDINGS OF THE 2021 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2021, : 3141 - 3158
  • [10] A PLUG-AND-PLAY APPROACH TO MULTIPARAMETRIC QUANTITATIVE MRI: IMAGE RECONSTRUCTION USING PRE-TRAINED DEEP DENOISERS
    Fatania, Ketan
    Pirkl, Carolin M.
    Menzel, Marion I.
    Hall, Peter
    Golbabaee, Mohammad
    2022 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (IEEE ISBI 2022), 2022