Purifier: Plug-and-play Backdoor Mitigation for Pre-trained Models Via Anomaly Activation Suppression

Cited by: 3
Authors
Zhang, Xiaoyu [1 ,2 ]
Jin, Yulin [1 ]
Wang, Tao [1 ]
Lou, Jian [3 ]
Chen, Xiaofeng [1 ]
Affiliations
[1] Xidian Univ, State Key Lab Integrated Serv Networks ISN, Xian 710071, Peoples R China
[2] State Key Lab Cryptol, POB 5159, Beijing 100878, Peoples R China
[3] Xidian Univ, Guangzhou Inst Technol, Xian, Peoples R China
Source
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022 | 2022
Funding
National Natural Science Foundation of China
Keywords
Backdoor attack and defense; Deep Neural Network;
DOI
10.1145/3503161.3548065
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Discipline Code
081203 ; 0835 ;
Abstract
Pre-trained models have been widely adopted in deep learning development, allowing downstream user-specific tasks to be fine-tuned with enormous computation savings. However, backdoor attacks pose a severe security threat to models subsequently built upon compromised pre-trained models, calling for effective countermeasures to mitigate the backdoor threat before victim models are deployed in safety-critical applications. This paper proposes Purifier, a novel backdoor mitigation framework for pre-trained models that suppresses anomaly activation. Purifier is motivated by the observation that backdoor triggers induce anomalous activation patterns at different granularities (e.g., channel-wise, cube-wise, and feature-wise), and that suppressing at the right granularity is vital to both robustness and accuracy. Purifier can thus defend against diverse types of backdoor triggers without any prior knowledge of the attack, while remaining convenient and flexible to deploy, i.e., plug-and-play. Extensive experiments against a series of mainstream state-of-the-art attacks show that Purifier outperforms state-of-the-art defenses in both defense effectiveness and model inference accuracy on clean examples. Our code and Appendix can be found at github.com/RUIYUN-ML/Purifier.
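The abstract names the mechanism (suppressing anomalous activations at a chosen granularity, in a plug-and-play wrapper) but not its exact rule. Below is a minimal sketch of one plausible channel-wise variant in PyTorch; the class AnomalySuppressor, the percentile bound, and the calibration pass over clean batches are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of channel-wise anomaly activation suppression.
# Not the Purifier algorithm itself: the clipping rule and names here
# are assumptions for illustration only.
import torch
import torch.nn as nn


class AnomalySuppressor(nn.Module):
    """Wraps an existing layer and clamps its per-channel activations to
    upper bounds estimated from a small set of clean examples."""

    def __init__(self, layer: nn.Module, percentile: float = 99.0):
        super().__init__()
        self.layer = layer
        self.percentile = percentile
        # Filled in by calibrate(); empty means "pass through unchanged".
        self.register_buffer("upper", torch.empty(0))

    @torch.no_grad()
    def calibrate(self, clean_batches):
        # Record per-channel activation maxima over clean batches, then
        # keep a high percentile of them as the "normal" upper bound.
        maxima = []
        for x in clean_batches:
            act = self.layer(x)                      # (N, C, H, W)
            maxima.append(act.amax(dim=(0, 2, 3)))   # per-channel max
        stacked = torch.stack(maxima)                # (num_batches, C)
        self.upper = torch.quantile(stacked, self.percentile / 100.0, dim=0)

    def forward(self, x):
        act = self.layer(x)
        if self.upper.numel() == 0:
            return act  # not calibrated yet
        # Trigger inputs tend to fire a few channels far above the clean
        # range; clipping them suppresses the backdoor signal while
        # leaving in-range (clean) activations untouched.
        return torch.minimum(act, self.upper.view(1, -1, 1, 1))
```

In this hypothetical setup the wrapper would be plugged in without retraining, e.g. model.layer3 = AnomalySuppressor(model.layer3), followed by a single calibrate() pass over a few clean batches before deployment.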
Pages: 4291 - 4299 (9 pages)
Related Papers
50 items in total
  • [1] Plug-and-Play Document Modules for Pre-trained Models
    Xiao, Chaojun
    Zhang, Zhengyan
    Han, Xu
    Chan, Chi-Min
    Lin, Yankai
    Liu, Zhiyuan
    Li, Xiangyang
    Li, Zhonghua
    Cao, Zhao
    Sun, Maosong
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15713 - 15729
  • [2] Plug-and-Play Knowledge Injection for Pre-trained Language Models
    Zhang, Zhengyan
    Zeng, Zhiyuan
    Lin, Yankai
    Wang, Huadong
    Ye, Deming
    Xiao, Chaojun
    Han, Xu
    Liu, Zhiyuan
    Li, Peng
    Sun, Maosong
    Zhou, Jie
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 10641 - 10656
  • [3] Pre-trained Diffusion Models for Plug-and-Play Medical Image Enhancement
    Ma, Jun
    Zhu, Yuanzhi
    You, Chenyu
    Wang, Bo
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT III, 2023, 14222 : 3 - 13
  • [4] Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
    Xiao, Chaojun
    Luo, Yuqi
    Zhang, Wenbin
    Zhang, Pengle
    Han, Xu
    Lin, Yankai
    Zhang, Zhengyan
    Xie, Ruobing
    Liu, Zhiyuan
    Sun, Maosong
    Zhou, Jie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 9947 - 9959
  • [5] Aliasing Backdoor Attacks on Pre-trained Models
    Wei, Cheng'an
    Lee, Yeonjoon
    Chen, Kai
    Meng, Guozhu
    Lv, Peizhuo
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 2707 - 2724
  • [6] Mutual Information Guided Backdoor Mitigation for Pre-Trained Encoders
    Han, Tingxu
    Sun, Weisong
    Ding, Ziqi
    Fang, Chunrong
    Qian, Hanwei
    Li, Jiaxun
    Chen, Zhenyu
    Zhang, Xiangyu
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20 : 3414 - 3428
  • [7] Noise Calibration: Plug-and-Play Content-Preserving Video Enhancement Using Pre-trained Video Diffusion Models
    Yang, Qinyu
    Chen, Haoxin
    Zhang, Yong
    Xia, Menghan
    Cun, Xiaodong
    Su, Zhixun
    Shan, Ying
    COMPUTER VISION - ECCV 2024, PT XXXVI, 2025, 15094 : 307 - 326
  • [8] PPT: Backdoor Attacks on Pre-trained Models via Poisoned Prompt Tuning
    Du, Wei
    Zhao, Yichun
    Li, Boqun
    Liu, Gongshen
    Wang, Shilin
    PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 680 - 686
  • [9] Backdoor Pre-trained Models Can Transfer to All
    Shen, Lujia
    Ji, Shouling
    Zhang, Xuhong
    Li, Jinfeng
    Chen, Jing
    Shi, Jie
    Fang, Chengfang
    Yin, Jianwei
    Wang, Ting
    CCS '21: PROCEEDINGS OF THE 2021 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2021, : 3141 - 3158
  • [10] A PLUG-AND-PLAY APPROACH TO MULTIPARAMETRIC QUANTITATIVE MRI: IMAGE RECONSTRUCTION USING PRE-TRAINED DEEP DENOISERS
    Fatania, Ketan
    Pirkl, Carolin M.
    Menzel, Marion I.
    Hall, Peter
    Golbabaee, Mohammad
    2022 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (IEEE ISBI 2022), 2022