A Textual Backdoor Defense Method Based on Deep Feature Classification

Cited by: 1
Authors
Shao, Kun [1]
Yang, Junan [1]
Hu, Pengjiang [1]
Li, Xiaoshuai [1]
Affiliations
[1] Natl Univ Def Technol, Coll Elect Engn, Hefei 230037, Peoples R China
Keywords
deep neural networks; natural language processing; adversarial machine learning; backdoor attacks; backdoor defenses; ATTACKS
DOI
10.3390/e25020220
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
Natural language processing (NLP) models based on deep neural networks (DNNs) are vulnerable to backdoor attacks. Existing backdoor defense methods are limited in both effectiveness and the scenarios they cover. We propose a textual backdoor defense method based on deep feature classification, comprising deep feature extraction and classifier construction. The method exploits the distinguishability of the deep features of poisoned and benign data, and it implements backdoor defense in both offline and online scenarios. We conducted defense experiments on two datasets and two models against a variety of backdoor attacks. The experimental results demonstrate the effectiveness of this defense approach, which outperforms the baseline defense method.
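The idea of classifying deep features can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say which layers the features come from or which classifier is built, so this example swaps in synthetic Gaussian vectors for the deep features and a nearest-centroid rule for the classifier. All function names (`fit_detector`, `is_poisoned`, `make_synthetic_features`) are hypothetical.

```python
import random
from statistics import fmean


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [fmean(column) for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_detector(benign_features, poisoned_features):
    """'Classifier construction' stand-in: store one centroid per class."""
    return {
        "benign": centroid(benign_features),
        "poisoned": centroid(poisoned_features),
    }


def is_poisoned(detector, feature):
    """Flag a feature vector that lies closer to the poisoned centroid."""
    return (squared_distance(feature, detector["poisoned"])
            < squared_distance(feature, detector["benign"]))


def make_synthetic_features(n, shift, dim=8, seed=0):
    """ASSUMPTION: Gaussian vectors stand in for real hidden-layer features."""
    rng = random.Random(seed)
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


# Synthetic "deep features"; poisoned samples are shifted so they are separable.
benign = make_synthetic_features(200, shift=0.0, seed=1)
poisoned = make_synthetic_features(200, shift=6.0, seed=2)
detector = fit_detector(benign, poisoned)

# Offline scenario: filter a suspicious training set before (re)training.
training_set = benign[:50] + poisoned[:50]
filtered = [f for f in training_set if not is_poisoned(detector, f)]

# Online scenario: check a single incoming input at inference time.
incoming_feature = [6.0] * 8
reject_input = is_poisoned(detector, incoming_feature)
```

In this toy setting the two classes are separable by construction. With a real model, the features would be hidden-layer activations, and how well they separate depends on the attack and the model.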
Pages: 13
Related papers
50 in total
  • [21] Deep learning-based sentiment classification of evaluative text based on Multi-feature fusion
    Abdi, Asad
    Shamsuddin, Siti Mariyam
    Hasan, Shafaatunnur
    Piran, Jalil
    INFORMATION PROCESSING & MANAGEMENT, 2019, 56 (04) : 1245 - 1259
  • [22] A Review on Medical Textual Question Answering Systems Based on Deep Learning Approaches
    Mutabazi, Emmanuel
    Ni, Jianjun
    Tang, Guangyi
    Cao, Weidong
    APPLIED SCIENCES-BASEL, 2021, 11 (12):
  • [23] Deep Feature Extraction and Classification of Android Malware Images
    Singh, Jaiteg
    Thakur, Deepak
    Ali, Farman
    Gera, Tanya
    Kwak, Kyung Sup
    SENSORS, 2020, 20 (24) : 1 - 29
  • [24] SecureNet: Proactive intellectual property protection and model security defense for DNNs based on backdoor learning
    Li, Peihao
    Huang, Jie
    Wu, Huaqing
    Zhang, Zeping
    Qi, Chunyang
    NEURAL NETWORKS, 2024, 174
  • [25] Effective defense against physically embedded backdoor attacks via clustering-based filtering
    Mohammed Kutbi
    Complex & Intelligent Systems, 2025, 11 (6)
  • [26] Latent Space-Based Backdoor Attacks Against Deep Neural Networks
    Kristanto, Adrian
    Wang, Shuo
    Rudolph, Carsten
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [27] Backdoor attacks against deep reinforcement learning based traffic signal control systems
    Heng Zhang
    Jun Gu
    Zhikun Zhang
    Linkang Du
    Yongmin Zhang
    Yan Ren
    Jian Zhang
    Hongran Li
    Peer-to-Peer Networking and Applications, 2023, 16 : 466 - 474
  • [28] Backdoor attacks against deep reinforcement learning based traffic signal control systems
    Zhang, Heng
    Gu, Jun
    Zhang, Zhikun
    Du, Linkang
    Zhang, Yongmin
    Ren, Yan
    Zhang, Jian
    Li, Hongran
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2023, 16 (01) : 466 - 474
  • [29] Deep Neural Network Based Hyperspectral Pixel Classification With Factorized Spectral-Spatial Feature Representation
    Chen, Jingzhou
    Chen, Siyu
    Zhou, Peilin
    Qian, Yuntao
    IEEE ACCESS, 2019, 7 : 81407 - 81418
  • [30] A novel method for feature learning and network intrusion classification
    Alzahrani, Ahmed S.
    Shah, Reehan Ali
    Qian, Yuntao
    Ali, Munwar
    ALEXANDRIA ENGINEERING JOURNAL, 2020, 59 (03) : 1159 - 1169