Stealthy Backdoor Attack Based on Singular Value Decomposition

Cited by: 0
Authors
Wu S.-X. [1 ]
Yin Y.-Y. [1 ]
Song S.-Q. [1 ]
Chen G.-H. [1 ]
Sang J.-T. [1 ]
Yu J. [1 ]
Affiliations
[1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing
Source
Ruan Jian Xue Bao/Journal of Software | 2024, Vol. 35, No. 5
Keywords
attack success rate; backdoor attack; singular value decomposition; stealthy
DOI
10.13328/j.cnki.jos.006949
Abstract
Deep neural networks are vulnerable to well-designed backdoor attacks during training. Such attacks control the model's output at test time by injecting trigger-bearing samples into the training set: the attacked model performs normally on a clean test set but misclassifies any input carrying the backdoor trigger as the attacker's target class. Existing backdoor attack methods offer poor invisibility and still leave room for higher attack success rates. To address these limitations, a backdoor attack method based on singular value decomposition (SVD) is proposed. The method can be implemented in two ways. The first sets some singular values of the image to zero; the resulting image is compressed to a certain extent and serves as an effective backdoor trigger. The second injects the singular-vector information of the attack target class into the left and right singular vectors of the image, which likewise achieves an effective backdoor attack. The backdoored images produced by both approaches are visually almost identical to the originals. Experiments on multiple datasets show that SVD can be effectively leveraged in backdoor attack algorithms to attack neural networks with considerably high success rates. © 2024 Chinese Academy of Sciences. All rights reserved.
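The abstract describes the two trigger constructions only at a high level. Below is a minimal NumPy sketch of how each variant could look, assuming H x W x C images with values in [0, 255]; the function names and the `keep` and `alpha` parameters are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the two SVD-based trigger constructions described
# in the abstract; names and parameters are assumptions, not the paper's code.
import numpy as np

def svd_truncation_trigger(image: np.ndarray, keep: int = 32) -> np.ndarray:
    """Variant 1: zero out all but the top-`keep` singular values per channel,
    yielding a mildly compressed image that serves as the backdoor trigger."""
    out = np.empty_like(image, dtype=np.float64)
    for c in range(image.shape[2]):                   # assume H x W x C layout
        u, s, vt = np.linalg.svd(image[..., c].astype(np.float64),
                                 full_matrices=False)
        s[keep:] = 0.0                                # drop trailing singular values
        out[..., c] = (u * s) @ vt                    # low-rank reconstruction
    return np.clip(out, 0, 255)

def singular_vector_injection_trigger(image: np.ndarray,
                                      target: np.ndarray,
                                      alpha: float = 0.1) -> np.ndarray:
    """Variant 2: blend the singular vectors of a same-sized target-class image
    into the input's left and right singular vectors, keeping the input's own
    singular values."""
    out = np.empty_like(image, dtype=np.float64)
    for c in range(image.shape[2]):
        u, s, vt = np.linalg.svd(image[..., c].astype(np.float64),
                                 full_matrices=False)
        ut, _, vtt = np.linalg.svd(target[..., c].astype(np.float64),
                                   full_matrices=False)
        u_mix = (1 - alpha) * u + alpha * ut          # inject target-class info
        vt_mix = (1 - alpha) * vt + alpha * vtt
        out[..., c] = (u_mix * s) @ vt_mix
    return np.clip(out, 0, 255)
```

In a poisoning pipeline, triggered copies of training images would be relabeled with the target class; the rank cutoff `keep` and the blending weight `alpha` would then trade off stealthiness against attack success rate.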
Pages: 2400-2413
Page count: 13