Defeating Misclassification Attacks Against Transfer Learning

Cited by: 3
Authors
Wu, Bang [1]
Wang, Shuo [2]
Yuan, Xingliang [1]
Wang, Cong [3]
Rudolph, Carsten [1]
Yang, Xiangwen [1]
Affiliations
[1] Monash Univ, Dept Informat Technol, Clayton, Vic 3800, Australia
[2] CSIRO Data61, Clayton, Vic 3168, Australia
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Keywords
Transfer learning; Task analysis; Mathematical models; Training; Computational modeling; Data models; Perturbation methods; Deep neural network; defence against adversarial examples; transfer learning; pre-trained model;
DOI
10.1109/TDSC.2022.3144988
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Transfer learning is prevalent as a technique to efficiently generate new models (Student models) based on the knowledge transferred from a pre-trained model (Teacher model). However, Teacher models are often publicly available for sharing and reuse, which inevitably introduces vulnerability to trigger severe attacks against transfer learning systems. In this article, we take a first step towards mitigating one of the most advanced misclassification attacks in transfer learning. We design a distilled differentiator via activation-based network pruning to enervate the attack transferability while retaining accuracy. We adopt an ensemble structure from variant differentiators to improve the defence robustness. To avoid the bloated ensemble size during inference, we propose a two-phase defence, in which inference from the Student model is first performed to narrow down the candidate differentiators to be assembled, and later only a small, fixed number of them can be chosen to validate clean or reject adversarial inputs effectively. Our comprehensive evaluations on both large and small image recognition tasks confirm that the Student models with our defence of only 5 differentiators are immune to over 90% of the adversarial inputs with an accuracy loss of less than 10%. Our comparison also demonstrates that our design outperforms prior problematic defences.
Pages: 886-901
Number of pages: 16