Generating Black-Box Adversarial Examples in Sparse Domain

Cited by: 7
Authors
Zanddizari, Hadi [1 ]
Zeinali, Behnam [1 ]
Chang, J. Morris [1 ]
Affiliations
[1] Univ S Florida, Dept Elect Engn, Tampa, FL 33620 USA
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2022 / Volume 6 / Issue 4
Keywords
Perturbation methods; Training; Discrete cosine transforms; Dictionaries; Frequency-domain analysis; Computational modeling; Adaptation models; Convolutional neural network; black-box attack; deep learning; sparse representation; SECURITY; ATTACKS;
DOI
10.1109/TETCI.2021.3122467
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Applications of machine learning (ML) models and convolutional neural networks (CNNs) have increased rapidly. Although state-of-the-art CNNs provide high accuracy in many applications, recent investigations show that such networks are highly vulnerable to adversarial attacks. The black-box adversarial attack is a type of attack in which the attacker has no knowledge of the model or the training dataset, but does have access to some input data and their labels. In this paper, we propose a novel approach for generating a black-box attack in the sparse domain, where the most important information of an image can be observed. Our investigation shows that large sparse (LaS) components play a critical role in the performance of image classifiers. Under this presumption, to generate an adversarial example we transform an image into a sparse domain and apply a threshold to select only $k$ LaS components. In contrast to very recent works that randomly perturb $k$ low-frequency (LoF) components, we perturb the $k$ LaS components either randomly (query-based) or in the direction of the most correlated sparse signal from a different class. We show that LaS components carry some middle- and higher-frequency information, which leads to fooling image classifiers with fewer queries. We demonstrate the effectiveness of this approach by fooling six state-of-the-art image classifiers, the TensorFlow Lite (TFLite) model of the Google Cloud Vision platform, and the YOLOv5 object detection model. Mean squared error (MSE) and peak signal-to-noise ratio (PSNR) are used as quality metrics. We also present a theoretical proof connecting these metrics to the level of perturbation in the sparse domain.
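The abstract describes a pipeline of transforming an image into a sparse domain, thresholding to keep only the $k$ largest sparse (LaS) coefficients, perturbing those coefficients, and reporting MSE and PSNR in the image domain. The sketch below illustrates that pipeline under stated assumptions: the 2-D DCT is used as the sparsifying transform, the perturbation is a random (query-style) one proportional to the kept coefficients, and the parameters `k` and `epsilon` are hypothetical. It is a minimal illustration, not the authors' implementation.

```python
# Minimal sketch of a sparse-domain (DCT) perturbation of the k largest
# coefficients, with MSE/PSNR as image-domain quality metrics.
# Assumptions: grayscale image with values in [0, 1]; k and epsilon are
# illustrative choices, not values from the paper.
import numpy as np
from scipy.fft import dctn, idctn


def perturb_las_components(image, k=100, epsilon=0.05, rng=None):
    """Randomly perturb the k largest-magnitude DCT coefficients of `image`
    and return the perturbed image."""
    rng = np.random.default_rng() if rng is None else rng

    coeffs = dctn(image, norm="ortho")           # sparse-domain representation
    flat = np.abs(coeffs).ravel()
    top_k = np.argpartition(flat, -k)[-k:]       # indices of the k LaS components

    # Random perturbation proportional to the magnitude of the kept coefficients.
    noise = np.zeros_like(flat)
    noise[top_k] = epsilon * rng.standard_normal(k) * flat[top_k]

    perturbed = coeffs + noise.reshape(coeffs.shape)
    return np.clip(idctn(perturbed, norm="ortho"), 0.0, 1.0)


def mse_psnr(x, y, peak=1.0):
    """Quality metrics used in the paper: mean squared error and PSNR (dB)."""
    mse = np.mean((x - y) ** 2)
    psnr = 10.0 * np.log10(peak ** 2 / mse) if mse > 0 else np.inf
    return mse, psnr


if __name__ == "__main__":
    img = np.random.rand(32, 32)                 # stand-in for a real test image
    adv = perturb_las_components(img, k=50, epsilon=0.1)
    print("MSE=%.6f  PSNR=%.2f dB" % mse_psnr(img, adv))
```

In a query-based attack such perturbations would be generated repeatedly and submitted to the target classifier until its prediction changes; the paper's point is that perturbing LaS components tends to require fewer such queries than perturbing low-frequency components.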
Pages: 795-804
Number of pages: 10