Towards Understanding and Enhancing Robustness of Deep Learning Models against Malicious Unlearning Attacks

Cited by: 13
Authors
Qian, Wei [1 ]
Zhao, Chenxu [1 ]
Le, Wei [1 ]
Ma, Meiyi [2 ]
Huai, Mengdi [1 ]
Affiliations
[1] Iowa State Univ, Ames, IA 50011 USA
[2] Vanderbilt Univ, Nashville, TN USA
Source
PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023
Funding
U.S. National Science Foundation
Keywords
Deep learning; data deletion; malicious attacks; security and privacy;
DOI
10.1145/3580305.3599526
CLC number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Given the availability of abundant data, deep learning models have advanced rapidly and become ubiquitous over the past decade. In practice, for many reasons (e.g., privacy, usability, and fidelity), individuals may also want trained deep models to forget some specific data. Motivated by this, machine unlearning (also known as selective data forgetting) has been intensively studied; it aims to remove the influence that particular training samples had on the trained model. However, people usually employ machine unlearning methods as trusted basic tools and rarely question their reliability. In fact, the increasingly critical role of machine unlearning exposes deep learning models to the risk of malicious attacks. To understand how deep learning models behave in malicious environments, we argue that it is critical to study their robustness to malicious unlearning attacks, which occur during the unlearning process. To bridge this gap, in this paper we first demonstrate that malicious unlearning attacks pose immense threats to the security of deep learning systems. Specifically, we present a broad class of malicious unlearning attacks in which maliciously crafted unlearning requests trigger deep learning models to misbehave on target samples in a highly controllable and predictable manner. In addition, to improve the robustness of deep learning models, we present a general defense mechanism that aims to identify effective malicious unlearning requests, based on their gradient influence on the unlearned models, and to unlearn their effect. Furthermore, we conduct theoretical analyses of the proposed methods. Extensive experiments on real-world datasets validate the vulnerability of deep learning models to malicious unlearning attacks and the effectiveness of the introduced defense mechanism.
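The defense summarized above screens unlearning requests by their gradient influence on the model. Below is a minimal, hypothetical Python/PyTorch sketch of that general idea only (not the authors' actual algorithm): each incoming unlearning request is scored by the norm of its loss gradient with respect to the current model parameters, and requests whose influence is a statistical outlier are flagged as potentially malicious. The function names, the toy model and data, and the simple z-score rule are all illustrative assumptions.

    # Illustrative sketch of gradient-influence screening of unlearning requests.
    # Assumption: requests with anomalously large gradient influence are suspicious.
    import torch
    import torch.nn.functional as F

    def grad_influence(model, x, y):
        """L2 norm of d(loss)/d(parameters) for a single unlearning request (x, y)."""
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])
        return torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()

    def score_unlearning_requests(model, requests, z_threshold=2.0):
        """Flag requests whose gradient influence is an outlier (simple z-score rule)."""
        scores = torch.tensor([grad_influence(model, x, y) for x, y in requests])
        z = (scores - scores.mean()) / (scores.std() + 1e-8)
        return [i for i, zi in enumerate(z.tolist()) if zi > z_threshold]

    if __name__ == "__main__":
        torch.manual_seed(0)
        model = torch.nn.Sequential(
            torch.nn.Linear(10, 16), torch.nn.ReLU(), torch.nn.Linear(16, 2)
        )
        # Toy "unlearning requests": benign samples plus one heavily perturbed one.
        requests = [(torch.randn(10), torch.tensor(0)) for _ in range(19)]
        requests.append((10.0 * torch.randn(10), torch.tensor(1)))  # crafted-looking request
        print("Suspicious request indices:", score_unlearning_requests(model, requests))

In this toy setup the perturbed request typically dominates the gradient-norm distribution and is flagged; a real defense would operate on the trained model and a more principled influence estimate, as described in the paper.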
Pages: 1932-1942
Page count: 11