Collaborative Defense-GAN for protecting adversarial attacks on classification system

Cited by: 15
Authors
Laykaviriyakul, Pranpaveen [1 ]
Phaisangittisagul, Ekachai [1 ]
Affiliations
[1] Kasetsart Univ, Dept Elect Engn, Fac Engn, Bangkok, Thailand
Keywords
Adversarial attacks; Adversarial samples; Attack generator; Defense generator; Black-box attack; White-box attack
DOI
10.1016/j.eswa.2022.118957
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With rapid progress and notable success across a wide range of applications, deep learning has been extensively employed for solving complex problems. However, deep learning models are vulnerable to carefully crafted inputs known as adversarial samples, which are designed to deceive a model while remaining imperceptible to humans. This vulnerability to adversarial attacks is therefore a major concern in life-critical applications of deep learning. In this paper, a novel approach to countering adversarial samples is proposed to strengthen the robustness of a deep learning model. The strategy is to filter the perturbation noise out of adversarial samples prior to prediction. The proposed defense framework is based on DiscoGANs and discovers the relation between attacker and defender characteristics. Attacker models are created to generate adversarial samples from the training data, while the defender model is trained to reconstruct the original samples from the adversarial ones. The two models are trained to compete with each other in an alternating manner. Experimental results under different attack models are compared with popular defense mechanisms on three benchmark datasets. The proposed method shows promising results, improving robustness against both white-box and black-box attacks while also reducing computation time.
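To make the training scheme concrete, here is a minimal PyTorch sketch of the alternating attacker/defender loop the abstract describes. The toy `Generator` architecture, the function name `alternating_step`, the perturbation budget `eps`, and the equal loss weighting are illustrative assumptions, not the authors' implementation; the paper's actual generators follow the DiscoGAN design, whose discriminators and cross-domain reconstruction losses are omitted here for brevity.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Toy image-to-image network standing in for the paper's
    DiscoGAN-style generators (used for both attacker and defender)."""
    def __init__(self, channels: int = 1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def alternating_step(attacker, defender, classifier, x, y,
                     opt_atk, opt_def, eps: float = 0.1):
    """One alternating round: the attacker learns a perturbation that fools
    the classifier; the defender learns to reconstruct the clean input from
    the attacked one. The classifier is assumed pretrained and frozen
    (requires_grad_(False) on its parameters)."""
    ce, l1 = nn.CrossEntropyLoss(), nn.L1Loss()

    # Attacker update: maximize classification loss on the perturbed input
    # (minimizing the negated cross-entropy).
    x_adv = (x + eps * attacker(x)).clamp(-1.0, 1.0)
    atk_loss = -ce(classifier(x_adv), y)
    opt_atk.zero_grad(); atk_loss.backward(); opt_atk.step()

    # Defender update: filter the perturbation and keep the label intact.
    # The attacker output is detached so this step updates the defender only.
    x_adv = (x + eps * attacker(x)).detach().clamp(-1.0, 1.0)
    x_rec = defender(x_adv)
    def_loss = l1(x_rec, x) + ce(classifier(x_rec), y)
    opt_def.zero_grad(); def_loss.backward(); opt_def.step()
    return atk_loss.item(), def_loss.item()

# At test time every input, clean or adversarial, is purified first:
#     logits = classifier(defender(x_input))
```

Detaching the attacker's output before the defender update keeps the two models' gradients separate, which mirrors the alternating competition the abstract describes: each side is optimized against a fixed snapshot of the other, and at inference time every input passes through the defender to strip the perturbation before classification.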
Pages: 15