Learning to Unlearn: Instance-Wise Unlearning for Pre-trained Classifiers

Cited by: 0
Authors
Cha, Sungmin [1]
Cho, Sungjun [2]
Hwang, Dasol [2]
Lee, Honglak [2]
Moon, Taesup [3]
Lee, Moontae [2,4]
Affiliations
[1] New York Univ, New York, NY USA
[2] LG AI Res, Seoul, South Korea
[3] Seoul Natl Univ, INMC, ASRI, Seoul, South Korea
[4] Univ Illinois, Chicago, IL USA
Source
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38, NO 10 | 2024
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Since the recent advent of data protection regulations (e.g., the General Data Protection Regulation), there has been increasing demand for deleting information learned from sensitive data in pre-trained models without retraining from scratch. The inherent vulnerability of neural networks to adversarial attacks and unfairness also calls for a robust method to remove or correct information in an instance-wise fashion, while retaining predictive performance across the remaining data. To this end, we consider instance-wise unlearning, whose goal is to delete information on a set of instances from a pre-trained model, by either misclassifying each instance away from its original prediction or relabeling the instance to a different label. We also propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation level, and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information. Both methods only require the pre-trained model and the data instances to forget, allowing painless application to real-life settings where the entire training set is unavailable. Through extensive experimentation on various image classification benchmarks, we show that our approach effectively preserves knowledge of the remaining data while unlearning given instances, in both single-task and continual unlearning scenarios.
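The abstract describes the objective only at a high level. As one plausible reading, and not the authors' released implementation, the minimal PyTorch-style sketch below pushes forget-set instances away from their original labels while regularizing the model to match a frozen copy of the pre-trained classifier on adversarial examples crafted from those instances (a stand-in for the unavailable remaining data). The function name unlearn_step, the FGSM step size, and the loss weight lam are illustrative assumptions.

```python
# Hypothetical sketch of instance-wise unlearning as described in the abstract.
# Not the authors' code: the attack step, loss weighting, and optimizer
# handling are illustrative assumptions.
import torch
import torch.nn.functional as F

def unlearn_step(model, frozen_model, forget_x, forget_y, optimizer, lam=1.0):
    """One update that (1) misclassifies forget instances away from their
    original labels and (2) preserves behavior on adversarial examples
    built from the forget instances (a proxy for the remaining data)."""
    model.train()

    # (1) Forgetting term: negate the cross-entropy on the forget set so that
    # gradient descent pushes predictions away from the original labels.
    forget_loss = -F.cross_entropy(model(forget_x), forget_y)

    # (2) Retention term: craft adversarial examples with a single FGSM step
    # against the frozen pre-trained model (step size 8/255 is an assumption),
    # then keep the current model's predictions on them close to the frozen
    # model's predictions via a KL term.
    adv_x = forget_x.clone().detach().requires_grad_(True)
    attack_loss = F.cross_entropy(frozen_model(adv_x), forget_y)
    grad = torch.autograd.grad(attack_loss, adv_x)[0]
    adv_x = (adv_x + (8 / 255) * grad.sign()).detach()

    with torch.no_grad():
        target_probs = F.softmax(frozen_model(adv_x), dim=1)
    retain_loss = F.kl_div(F.log_softmax(model(adv_x), dim=1),
                           target_probs, reduction="batchmean")

    loss = forget_loss + lam * retain_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice, frozen_model would be a frozen copy of the pre-trained classifier (e.g., copy.deepcopy(model).eval()), and unlearn_step would be iterated over batches of the forget set. The weight importance mechanism mentioned in the abstract, which restricts which parameters are updated, is omitted from this sketch.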
Pages: 11186-11194
Page count: 9