POSTER: Black-box and Target-specific Attack Against Interpretable Deep Learning Systems

Cited by: 4
Authors
Abdukhamidov, Eldor [1]
Juraev, Firuz [1]
Abuhamad, Mohammed [2]
Abuhmed, Tamer [1]
Affiliations
[1] Sungkyunkwan Univ, Suwon, South Korea
[2] Loyola Univ, Chicago, IL 60611 USA
Source
ASIA CCS'22: PROCEEDINGS OF THE 2022 ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY | 2022
Funding
National Research Foundation of Singapore;
Keywords
Interpretable Machine Learning; Adversarial Machine Learning; Target-specific Attack; Single-class Attack; Genetic Algorithm;
DOI
10.1145/3488932.3527283
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Deep neural network (DNN) models are susceptible to malicious manipulation even in black-box settings. Providing explanations for DNN models offers a sense of security by involving a human, who can tell from the interpretation whether a sample is benign or adversarial, even against attacks that previous studies showed to achieve a high success rate. However, interpretable deep learning systems (IDLSes) have been shown to be susceptible to adversarial manipulation in white-box settings. Attacking IDLSes in black-box settings is challenging and remains an open research problem. In this work, we propose a black-box version of the white-box AdvEdge approach against IDLSes, which is query-efficient and gradient-free and requires no knowledge of the target DNN model or its coupled interpreter. Our approach combines transfer-based and score-based techniques using the effective microbial genetic algorithm (MGA). We achieve a high attack success rate with a small number of queries and high similarity between the interpretations of adversarial and benign samples.
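To make the score-based, gradient-free search mentioned in the abstract more concrete, the following is a minimal sketch of a generic microbial genetic algorithm (tournament between two random individuals, with the loser receiving genes from the winner and then being mutated). It is not the authors' implementation: the function names `microbial_ga` and `toy_score`, all parameter values, and the toy objective are assumptions made for illustration only. In an actual attack, the score would presumably come from querying the target model and its interpreter; here a simple stand-in function plays that role.

```python
import random


def toy_score(perturbation):
    """Hypothetical stand-in for a black-box score (higher is better).

    It rewards perturbations close to a fixed hidden vector. A real attack
    would query the target system here instead.
    """
    hidden = [0.05, -0.03, 0.08, -0.07]
    return -sum((p - h) ** 2 for p, h in zip(perturbation, hidden))


def microbial_ga(fitness, dim, pop_size=20, tournaments=500,
                 crossover_rate=0.5, mutation_rate=0.1, sigma=0.02,
                 bound=0.1, seed=0):
    """Generic microbial GA sketch. Uses only fitness scores, no gradients.

    Returns (best_individual, best_score, number_of_fitness_queries).
    """
    rng = random.Random(seed)
    # Random initial population inside the perturbation bound.
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    queries = pop_size

    for _ in range(tournaments):
        # Tournament between two distinct random individuals.
        a, b = rng.sample(range(pop_size), 2)
        winner, loser = (a, b) if scores[a] >= scores[b] else (b, a)
        for i in range(dim):
            # "Infection": the loser copies some genes from the winner.
            if rng.random() < crossover_rate:
                pop[loser][i] = pop[winner][i]
            # Mutation: small Gaussian noise, clipped to the bound.
            if rng.random() < mutation_rate:
                mutated = pop[loser][i] + rng.gauss(0.0, sigma)
                pop[loser][i] = max(-bound, min(bound, mutated))
        # Only the modified loser is re-scored: one query per tournament.
        scores[loser] = fitness(pop[loser])
        queries += 1

    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best], queries


if __name__ == "__main__":
    best, score, queries = microbial_ga(toy_score, dim=4)
    print("best score:", score, "after", queries, "queries")
```

Because only the tournament loser is ever modified, the best score in the population never decreases, and each tournament costs exactly one new query. This is one reason this family of algorithms suits query-limited settings.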
Pages: 1216-1218
Page count: 3
Related papers
5 items in total
  • [1] AdvEdge: Optimizing Adversarial Perturbations Against Interpretable Deep Learning
    Abdukhamidov, Eldor
    Abuhamad, Mohammed
    Juraev, Firuz
    Chan-Tin, Eric
    AbuHmed, Tamer
    [J]. COMPUTATIONAL DATA AND SOCIAL NETWORKS, CSONET 2021, 2021, 13116 : 93 - 105
  • [2] Harvey I., 2009, LNCS, P126
  • [3] Madry A, 2019, Arxiv, DOI arXiv:1706.06083
  • [4] Simonyan K, 2014, Arxiv, DOI arXiv:1312.6034
  • [5] Learning Deep Features for Discriminative Localization
    Zhou, Bolei
    Khosla, Aditya
    Lapedriza, Agata
    Oliva, Aude
    Torralba, Antonio
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2921 - 2929