POSTER: Black-box and Target-specific Attack Against Interpretable Deep Learning Systems

Cited by: 4

Authors:

Abdukhamidov, Eldor [1]

Juraev, Firuz [1]

Abuhamad, Mohammed [2]

Abuhmed, Tamer [1]

Affiliations:

[1] Sungkyunkwan Univ, Suwon, South Korea

[2] Loyola Univ, Chicago, IL 60611 USA

Source:

ASIA CCS'22: PROCEEDINGS OF THE 2022 ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY | 2022

Funding:

National Research Foundation, Singapore;

Keywords:

Interpretable Machine Learning; Adversarial Machine Learning; Target-specific Attack; Single-class Attack; Genetic Algorithm;

DOI:

10.1145/3488932.3527283

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Deep neural network (DNN) models are susceptible to malicious manipulation even in black-box settings. Providing explanations for DNN models offers a sense of security through human involvement, because an interpretation can reveal whether a sample is benign or adversarial, even for attacks that achieved high success rates in previous studies. However, interpretable deep learning systems (IDLSes) have been shown to be susceptible to adversarial manipulation in white-box settings. Attacking IDLSes in black-box settings is challenging and remains an open research problem. In this work, we propose a black-box version of the white-box AdvEdge approach against IDLSes that is query-efficient and gradient-free and requires no knowledge of the target DNN model or its coupled interpreter. Our approach combines transfer-based and score-based techniques using the microbial genetic algorithm (MGA). We achieve a high attack success rate with a small number of queries and high similarity in interpretations between adversarial and benign samples.
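
The abstract names the microbial genetic algorithm (MGA) as the gradient-free search procedure but gives no implementation details. The following is a minimal, generic sketch of a microbial GA tournament loop, assuming a bounded perturbation vector and a score returned by a black-box query. It is not the authors' implementation: the function `microbial_ga`, the stand-in `toy_score`, and every parameter value are hypothetical choices for illustration only.

```python
import random


def microbial_ga(fitness, dim, pop_size=20, tournaments=500,
                 crossover_rate=0.5, mutation_rate=0.1, sigma=0.05,
                 bound=0.1, seed=0):
    """Generic microbial GA sketch: maximize `fitness` over [-bound, bound]^dim.

    `fitness` stands in for a score obtained by querying a black-box system.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Random initial population of candidate perturbations.
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]  # one query per individual

    for _ in range(tournaments):
        # Pick two distinct individuals and compare their scores.
        a, b = rng.sample(range(pop_size), 2)
        winner, loser = (a, b) if scores[a] >= scores[b] else (b, a)
        for i in range(dim):
            # "Infection": the loser copies some genes from the winner.
            if rng.random() < crossover_rate:
                pop[loser][i] = pop[winner][i]
            # Mutation: small Gaussian noise on some of the loser's genes.
            if rng.random() < mutation_rate:
                pop[loser][i] += rng.gauss(0.0, sigma)
            # Keep the perturbation inside the allowed bound.
            pop[loser][i] = max(-bound, min(bound, pop[loser][i]))
        # Only the loser changed, so only one new query is needed.
        scores[loser] = fitness(pop[loser])

    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_score(delta):
    """Hypothetical stand-in for a black-box score (higher is better)."""
    return -sum((d - 0.05) ** 2 for d in delta)


best_delta, best_score = microbial_ga(toy_score, dim=4)
```

In this sketch the winner of each tournament is never modified, so the best score in the population cannot decrease, and the total number of score evaluations is `pop_size + tournaments`. A real attack would replace `toy_score` with a score computed from the target system's outputs.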


Pages: 1216-1218

Page count: 3

5 references in total

[1] AdvEdge: Optimizing Adversarial Perturbations Against Interpretable Deep Learning
Abdukhamidov, Eldor
Abuhamad, Mohammed
Juraev, Firuz
Chan-Tin, Eric
AbuHmed, Tamer
[J]. COMPUTATIONAL DATA AND SOCIAL NETWORKS, CSONET 2021, 2021, 13116 : 93 - 105
[2] Harvey I., 2009, LNCS, P126
[3] Madry A, 2019, Arxiv, DOI arXiv:1706.06083
[4] Simonyan K, 2014, Arxiv, DOI arXiv:1312.6034
[5] Learning Deep Features for Discriminative Localization
Zhou, Bolei
Khosla, Aditya
Lapedriza, Agata
Oliva, Aude
Torralba, Antonio
[J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2921 - 2929
