Cassandra: Detecting Trojaned Networks From Adversarial Perturbations

Cited by: 6

Authors:

Zhang, Xiaoyu [1]

Gupta, Rohit [2]

Mian, Ajmal [3]

Rahnavard, Nazanin [4]

Shah, Mubarak [2]

Affiliations:

[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA

[2] Univ Cent Florida, Ctr Res Comp Vis, Orlando, FL 32816 USA

[3] Univ Western Australia, Dept Comp Sci & Software Engn, Perth, WA 6009, Australia

[4] Univ Cent Florida, Dept Elect Engn, Orlando, FL 32816 USA

Source:

IEEE ACCESS | 2021 / Volume 9 / Issue 09

Funding:

Australian Research Council;

Keywords:

Trojan horses; Perturbation methods; Computational modeling; Training; Data models; Feature extraction; Detectors; Deep learning; adversarial attack; backdoor detection; computer vision;

DOI:

10.1109/ACCESS.2021.3101289

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Deep neural networks are being widely deployed for critical tasks. In many cases, pre-trained models are sourced from vendors who may have disrupted the training pipeline to insert Trojan behaviors. These malicious behaviors can be triggered at the adversary's will, which is a serious security threat. To verify the integrity of a deep model, we propose a method that captures its fingerprint with adversarial perturbations. Inserting backdoors into a network alters its decision boundaries, which are effectively encoded by adversarial perturbations. Our proposed Trojan detection network learns features from adversarial patterns and their properties to encode the unknown trigger shape and the deviations in the decision boundaries caused by backdoors. Our method works without any clean samples, or with a limited number of clean samples for improved performance. It also performs anomaly detection to identify the target class of a Trojaned network, is invariant to trigger type, trigger size, and network architecture, and does not require any triggered samples. Experiments are performed on the MNIST, NIST-TrojAI and Odysseus datasets, with 5000 pre-trained models in total, making this the largest study to date on Trojan detection and achieving new state-of-the-art accuracy.
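
The abstract mentions anomaly detection for identifying the target class of a Trojaned network but does not say how it is done. As a rough illustration only, the sketch below applies a generic median-absolute-deviation (MAD) outlier rule to per-class perturbation norms. It is not the paper's detector, which learns features from adversarial perturbations. The function name `flag_target_class`, the `threshold` value, and the example norms are all hypothetical, and the sketch assumes that a backdoored target class can be reached with an unusually small perturbation.

```python
from statistics import median


def flag_target_class(norms, threshold=2.0):
    """Return the index of the class whose perturbation norm is an
    anomalously small outlier, or None if no class stands out.

    `norms` is assumed to hold one precomputed perturbation norm per class.
    """
    med = median(norms)
    # Median absolute deviation, scaled so that it is comparable to a
    # standard deviation under a normal distribution.
    mad = 1.4826 * median(abs(n - med) for n in norms)
    if mad == 0:
        return None  # all classes look alike; nothing to flag
    # Only unusually SMALL norms are treated as suspicious (assumption).
    scores = [(med - n) / mad for n in norms]
    best = max(range(len(norms)), key=lambda i: scores[i])
    return best if scores[best] > threshold else None


# Hypothetical per-class perturbation norms for a 10-class model.
norms = [4.1, 3.9, 4.3, 0.8, 4.0, 4.2, 3.8, 4.4, 4.1, 4.0]
print(flag_target_class(norms))  # prints 3
```

With these made-up numbers, class 3 needs a far smaller perturbation than the others and is the only one flagged. A model whose per-class norms are all similar returns `None`.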


Pages: 135856-135867

Page count: 12

50 entries in total

[1] Detecting Adversarial Perturbations with Salieny
Zhang, Chiliang
Yang, Zhimou
Ye, Zuochang
PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: IOT AND SMART CITY (ICIT 2018), 2018, : 25 - 30
[2] Detecting Adversarial Perturbations with Saliency
Zhang, Chiliang
Ye, Zuochang
Wang, Yan
Yang, Zhimou
2018 IEEE 3RD INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING (ICSIP), 2018, : 271 - 275
[3] Detecting backdoor in deep neural networks via intentional adversarial perturbations
Xue, Mingfu
Wu, Yinghao
Wu, Zhiyu
Zhang, Yushu
Wang, Jian
Liu, Weiqiang
INFORMATION SCIENCES, 2023, 634 : 564 - 577
[4] HYBRID DEFENSE FOR DEEP NEURAL NETWORKS: AN INTEGRATION OF DETECTING AND CLEANING ADVERSARIAL PERTURBATIONS
Fan, Weiqi
Sun, Guangling
Su, Yuying
Liu, Zhi
Lu, Xiaofeng
2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2019, : 210 - 215
[5] Detecting and Mitigating Adversarial Perturbations for Robust Face Recognition
Gaurav Goswami
Akshay Agarwal
Nalini Ratha
Richa Singh
Mayank Vatsa
International Journal of Computer Vision, 2019, 127 : 719 - 742
[6] Detecting and Mitigating Adversarial Perturbations for Robust Face Recognition
Goswami, Gaurav
Agarwal, Akshay
Ratha, Nalini
Singh, Richa
Vatsa, Mayank
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2019, 127 (6-7) : 719 - 742
[7] Detecting Adversarial Perturbations in Multi-Task Perception
Klingner, Marvin
Kumar, Varun Ravi
Yogamani, Senthil
Baer, Andreas
Fingscheidt, Tim
2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 13050 - 13057
[8] Trustworthy adaptive adversarial perturbations in social networks
Zhang, Jiawei
Wang, Jinwei
Wang, Hao
Luo, Xiangyang
Ma, Bin
JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2024, 80
[9] Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations
Wong, Alex
Mundhra, Mukund
Soatto, Stefano
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2879 - 2888
[10] Detecting Adversarial Perturbations Through Spatial Behavior in Activation Spaces
Katzir, Ziv
Elovici, Yuval
2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
