AdvEdge: Optimizing Adversarial Perturbations Against Interpretable Deep Learning

Cited by: 8
Authors
Abdukhamidov, Eldor [1 ]
Abuhamad, Mohammed [2 ]
Juraev, Firuz [1 ]
Chan-Tin, Eric [2 ]
AbuHmed, Tamer [1 ]
Affiliations
[1] Sungkyunkwan Univ, Seoul, South Korea
[2] Loyola Univ Chicago, Chicago, IL 60660 USA
Source
COMPUTATIONAL DATA AND SOCIAL NETWORKS, CSONET 2021 | 2021, Vol. 13116
Funding
National Research Foundation of Singapore
Keywords
Adversarial image; Deep learning; Interpretability
DOI
10.1007/978-3-030-91434-9_9
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Discipline Code
0812
Abstract
Deep Neural Networks (DNNs) have achieved state-of-the-art performance in various applications. It is crucial to verify that a model's high accuracy on a given task stems from a correct problem representation rather than from the misuse of artifacts in the data. Interpretation models have therefore become a key ingredient in developing deep learning models: they enable a better understanding of how DNN models work and offer a sense of security. However, interpretations are also vulnerable to malicious manipulation. We present AdvEdge and AdvEdge+, two attacks that mislead the target DNNs while deceiving their coupled interpretation models. We evaluate the proposed attacks against two DNN model architectures coupled with four representatives of different categories of interpretation models. The experimental results demonstrate the effectiveness of our attacks in deceiving the DNN models and their interpreters.
Pages: 93-105
Number of pages: 13
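The abstract describes attacks that perturb an input so that a classifier misclassifies it while its interpreter still produces a benign-looking attribution map. As a rough illustration of one ingredient such an attack might use, the sketch below runs a targeted PGD attack whose per-pixel step is weighted by a Sobel edge map, concentrating the perturbation along object edges. This is a minimal sketch under stated assumptions: the edge weighting, loss, and hyperparameters here are illustrative choices, not the authors' actual AdvEdge/AdvEdge+ optimization.

```python
# Illustrative sketch only: edge-weighted targeted PGD. The edge weighting
# and loss below are assumptions for demonstration, not the paper's method.
import torch
import torch.nn.functional as F


def edge_map(x: torch.Tensor) -> torch.Tensor:
    # Per-image Sobel edge magnitude, normalized to [0, 1].
    # x is (B, C, H, W) with pixel values in [0, 1].
    gray = x.mean(dim=1, keepdim=True)
    kx = torch.tensor([[-1.0, 0.0, 1.0],
                       [-2.0, 0.0, 2.0],
                       [-1.0, 0.0, 1.0]], device=x.device).view(1, 1, 3, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, kx.transpose(2, 3), padding=1)
    mag = (gx.pow(2) + gy.pow(2)).sqrt()
    return mag / (mag.amax(dim=(2, 3), keepdim=True) + 1e-8)


def edge_weighted_pgd(model, x, target, eps=8 / 255, alpha=1 / 255, steps=40):
    # Targeted L_inf PGD: each step is scaled per-pixel by the edge map of
    # the clean image, steering the perturbation toward edge regions.
    w = edge_map(x)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)  # minimized toward target
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - alpha * w * grad.sign()   # edge-weighted step
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)             # keep valid pixel range
    return x_adv.detach()
```

A full reproduction of the paper's attacks would additionally need a loss term that keeps the interpreter's attribution map close to the benign one; that component is omitted here because the record's abstract does not specify it.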