Augmented Neural Fine-Tuning for Efficient Backdoor Purification

Cited by: 1

Authors:

Karim, Nazmul [1]

Al Arafat, Abdullah [2]

Khalid, Umar [1]

Guo, Zhishan [2]

Rahnavard, Nazanin [1]

Affiliations:

[1] Univ Cent Florida, Orlando, FL 32816 USA

[2] North Carolina State Univ, Raleigh, NC USA

Source:

COMPUTER VISION - ECCV 2024, PT LXXX | 2025 / Volume 15138

Funding:

U.S. National Science Foundation;

Keywords:

ATTACK;

DOI:

10.1007/978-3-031-72989-8_23

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent studies have revealed the vulnerability of deep neural networks (DNNs) to various backdoor attacks, in which the behavior of a DNN can be compromised through certain types of triggers or poisoning mechanisms. State-of-the-art (SOTA) defenses employ overly sophisticated mechanisms that require either a computationally expensive adversarial search module for reverse-engineering the trigger distribution or an over-sensitive hyper-parameter selection module. Moreover, they offer sub-par performance in challenging scenarios, e.g., limited validation data and strong attacks. In this paper, we propose Neural mask Fine-Tuning (NFT), which aims to optimally re-organize neuron activities so that the effect of the backdoor is removed. Utilizing a simple data augmentation such as MixUp, NFT relaxes the trigger synthesis process and eliminates the need for the adversarial search module. Our study further reveals that direct weight fine-tuning under limited validation data results in poor post-purification clean test accuracy, primarily due to overfitting. To overcome this, we propose to fine-tune neural masks instead of model weights. In addition, a mask regularizer is devised to further mitigate model drift during the purification process. The distinct characteristics of NFT render it highly efficient in both runtime and sample usage, as it can remove the backdoor even when only a single sample is available from each class. We validate the effectiveness of NFT through extensive experiments covering the tasks of image classification, object detection, video action recognition, 3D point cloud, and natural language processing. We evaluate our method against 14 different attacks (LIRA, WaNet, etc.) on 11 benchmark data sets (ImageNet, UCF101, Pascal VOC, ModelNet, OpenSubtitles2012, etc.). Our code is available online in a GitHub repository.

Citation

Pages: 401-418

Page count: 18

67 references in total

