Defeating Misclassification Attacks Against Transfer Learning

Cited by: 3

Authors:

Wu, Bang [1]

Wang, Shuo [2]

Yuan, Xingliang [1]

Wang, Cong [3]

Rudolph, Carsten [1]

Yang, Xiangwen [1]

Affiliations:

[1] Monash Univ, Dept Informat Technol, Clayton, Vic 3800, Australia

[2] CSIRO Data61, Clayton, Vic 3168, Australia

[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING | 2023 / Vol. 20 / No. 02

Keywords:

Transfer learning; Task analysis; Mathematical models; Training; Computational modeling; Data models; Perturbation methods; Deep neural network; defence against adversarial examples; transfer learning; pre-trained model;

DOI:

10.1109/TDSC.2022.3144988

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Transfer learning is prevalent as a technique to efficiently generate new models (Student models) based on the knowledge transferred from a pre-trained model (Teacher model). However, Teacher models are often publicly available for sharing and reuse, which inevitably introduces vulnerability to trigger severe attacks against transfer learning systems. In this article, we take a first step towards mitigating one of the most advanced misclassification attacks in transfer learning. We design a distilled differentiator via activation-based network pruning to enervate the attack transferability while retaining accuracy. We adopt an ensemble structure from variant differentiators to improve the defence robustness. To avoid the bloated ensemble size during inference, we propose a two-phase defence, in which inference from the Student model is first performed to narrow down the candidate differentiators to be assembled, and later only a small, fixed number of them can be chosen to validate clean or reject adversarial inputs effectively. Our comprehensive evaluations on both large and small image recognition tasks confirm that the Student models with our defence of only 5 differentiators are immune to over 90% of the adversarial inputs with an accuracy loss of less than 10%. Our comparison also demonstrates that our design outperforms prior problematic defences.
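The two-phase flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `two_phase_predict`, `ensemble_size`, and `min_agreement`, the idea of keying differentiators by the Student's predicted label, and the majority-agreement rule are all assumptions made for illustration, and the toy models stand in for real networks.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Assumption: a "model" is any callable mapping an input vector to per-class scores.
Model = Callable[[Sequence[float]], List[float]]


def argmax(scores: Sequence[float]) -> int:
    """Index of the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def two_phase_predict(
    x: Sequence[float],
    student: Model,
    differentiators: Dict[int, List[Model]],  # hypothetical: pool keyed by class label
    ensemble_size: int = 5,                   # small, fixed number of differentiators
    min_agreement: float = 0.5,               # hypothetical acceptance threshold
) -> Optional[int]:
    """Return the Student's label if the chosen differentiators agree, else None (reject)."""
    # Phase 1: the Student's prediction narrows down the candidate differentiators.
    label = argmax(student(x))
    chosen = differentiators.get(label, [])[:ensemble_size]
    if not chosen:
        return None  # nothing available to validate the input, so reject

    # Phase 2: a fixed-size ensemble validates or rejects the input.
    votes = sum(1 for d in chosen if argmax(d(x)) == label)
    return label if votes / len(chosen) > min_agreement else None


# Toy stand-ins for trained networks (assumptions, two classes only).
student = lambda x: [x[0], x[1]]
agree = lambda x: [x[0], x[1]]      # mirrors the Student
disagree = lambda x: [x[1], x[0]]   # always predicts the other class
pool = {0: [agree, agree, disagree], 1: [disagree, disagree, agree]}

print(two_phase_predict([0.9, 0.1], student, pool, ensemble_size=3))  # 0 (accepted)
print(two_phase_predict([0.1, 0.9], student, pool, ensemble_size=3))  # None (rejected)
```

Capping the ensemble at `ensemble_size` reflects the abstract's point that only a small, fixed number of differentiators run at inference time, so the cost does not grow with the size of the full pool.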

Citation

Pages: 886-901

Page count: 16

50 items in total

[1] Safe Machine Learning and Defeating Adversarial Attacks
Rouhani, Bita Darvish
Samragh, Mohammad
Javidi, Tara
Koushanfar, Farinaz
IEEE SECURITY & PRIVACY, 2019, 17 (02) : 31 - 38
[2] Rethinking Membership Inference Attacks Against Transfer Learning
Wu, Cong
Chen, Jing
Fang, Qianru
He, Kun
Zhao, Ziming
Ren, Hao
Xu, Guowen
Liu, Yang
Xiang, Yang
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 6441 - 6454
[3] Teacher Model Fingerprinting Attacks Against Transfer Learning
Chen, Yufei
Shen, Chao
Wang, Cong
Zhang, Yang
PROCEEDINGS OF THE 31ST USENIX SECURITY SYMPOSIUM, 2022, : 3593 - 3610
[4] Membership inference attacks against transfer learning for generalized model
Chen, Jinyin
Shangguan, Wenchang
Zhang, Jingjing
Zheng, Haibin
Zheng, Yayu
Zhang, Xu-Hong
Tongxin Xuebao/Journal on Communications, 2021, 42 (10) : 197 - 210
[5] Detecting and Defeating Advanced Man-In-The-Middle Attacks against TLS
de la Hoz, Enrique
Paez-Reyes, Rafael
Cochrane, Gary
Marsa-Maestre, Ivan
Manuel Moreira-Lemus, Jose
Alarcos, Bernardo
2014 6TH INTERNATIONAL CONFERENCE ON CYBER CONFLICT (CYCON 2014), 2014, : 209 - +
[6] Defeating against sybil-attacks in peer-to-peer networks
Xiang, Xu
2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW), 2012, : 1218 - 1222
[7] Explaining Misclassification and Attacks in Deep Learning via Random Forests
Haffar, Rami
Domingo-Ferrer, Josep
Sanchez, David
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE (MDAI 2020), 2020, 12256 : 273 - 285
[8] Towards Defeating DDoS Attacks
Doyal, Alex
Zhan, Justin
Yu, Huiming Anna
2012 ASE INTERNATIONAL CONFERENCE ON CYBER SECURITY (CYBERSECURITY), 2012, : 209 - 212
[9] Defending against attacks tailored to transfer learning via feature distancing
Ji, Sangwoo
Park, Namgyu
Na, Dongbin
Zhu, Bin
Kim, Jong
COMPUTER VISION AND IMAGE UNDERSTANDING, 2022, 223
[10] Backdoor Attacks Against Transfer Learning With Pre-Trained Deep Learning Models
Wang, Shuo
Nepal, Surya
Rudolph, Carsten
Grobler, Marthie
Chen, Shangyu
Chen, Tianle
IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (03) : 1526 - 1539
