Understanding deep learning defenses against adversarial examples through visualizations for dynamic risk assessment

Cited by: 3
Authors
Echeberria-Barrio, Xabier [1 ]
Gil-Lerchundi, Amaia [1 ]
Egana-Zubia, Jon [1 ]
Orduna-Urrutia, Raul [1 ]
Affiliations
[1] Vicomtech Fdn, Basque Res & Technol Alliance BRTA, Mikeletegi 57, Donostia San Sebastian 20009, Spain
Keywords
Adversarial attacks; Adversarial defenses; Visualization
DOI
10.1007/s00521-021-06812-y
CLC number (Chinese Library Classification)
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, deep neural network models have been developed in many fields, where they have brought considerable advances. However, they have also begun to be used in tasks where risk is critical, and a misdiagnosis by these models can lead to serious accidents or even death. This concern has led researchers to study possible attacks on these models, uncovering a long list of vulnerabilities against which every model should be defended. The adversarial example attack is one of the most widely known, and several defenses have been developed to counter this threat. However, these defenses are as opaque as the deep neural network models themselves, and how they work remains largely unknown. Visualizing how a defense changes the behavior of the target model is therefore valuable for understanding more precisely how the performance of the defended model is modified. In this work, three defense strategies against adversarial example attacks have been selected in order to visualize the behavior modification each of them induces in the defended model: adversarial training, dimensionality reduction, and prediction similarity. The defenses have been applied to a model composed of convolutional neural network layers and dense neural network layers. For each defense, the behavior of the original model has been compared with the behavior of the defended model, with the target model represented as a graph in the visualization. This visualization makes it possible to identify the vulnerabilities of the model and shows how the defenses try to avoid them.
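Of the three defenses named in the abstract, adversarial training is the most standardized, so it lends itself to a short illustration. The following is a minimal sketch, assuming a Keras-style CNN + dense classifier and FGSM-crafted perturbations; the architecture, attack, epsilon value, and names such as build_model and fgsm are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of an adversarial-training defense (hypothetical, not the
# paper's exact setup). Assumes x_train is float32 in [0, 1] and y_train
# holds integer class labels.
import tensorflow as tf

def build_model(input_shape=(28, 28, 1), n_classes=10):
    # Convolutional layers followed by dense layers, mirroring the kind of
    # CNN + dense target model described in the abstract.
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

def fgsm(model, x, y, epsilon=0.1):
    # Fast Gradient Sign Method: perturb inputs along the sign of the
    # loss gradient to craft adversarial examples.
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.sparse_categorical_crossentropy(y, model(x))
    grad = tape.gradient(loss, x)
    return tf.clip_by_value(x + epsilon * tf.sign(grad), 0.0, 1.0)

def adversarial_training(model, x_train, y_train, epochs=5):
    # Each epoch, augment the clean data with adversarial examples so the
    # defended model also learns to classify the perturbed inputs.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    for _ in range(epochs):
        x_adv = fgsm(model, x_train, y_train)
        x_mix = tf.concat([x_train, x_adv], axis=0)
        y_mix = tf.concat([y_train, y_train], axis=0)
        model.fit(x_mix, y_mix, batch_size=128, epochs=1, verbose=0)
    return model
```

Comparing the internal activations of the original and the adversarially trained model on the same inputs is, in spirit, what the paper's graph visualization does for each defense.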
Pages: 20477-20490
Page count: 14