Explanation-Guided Diagnosis of Machine Learning Evasion Attacks

Cited by: 7
Authors
Amich, Abderrahmen [1]
Eshete, Birhanu [1]
Affiliations
[1] Univ Michigan, Dearborn, MI 48128 USA
Source
SECURITY AND PRIVACY IN COMMUNICATION NETWORKS, SECURECOMM 2021, PT I | 2021 / Vol. 398
Keywords
Machine learning evasion; Explainable machine learning
DOI
10.1007/978-3-030-90019-9_11
Chinese Library Classification (CLC) code
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine Learning (ML) models are susceptible to evasion attacks. Evasion accuracy is typically assessed using aggregate evasion rate, and it is an open question whether aggregate evasion rate enables feature-level diagnosis on the effect of adversarial perturbations on evasive predictions. In this paper, we introduce a novel framework that harnesses explainable ML methods to guide high-fidelity assessment of ML evasion attacks. Our framework enables explanation-guided correlation analysis between pre-evasion perturbations and post-evasion explanations. Towards systematic assessment of ML evasion attacks, we propose and evaluate a novel suite of model-agnostic metrics for sample-level and dataset-level correlation analysis. Using malware and image classifiers, we conduct comprehensive evaluations across diverse model architectures and complementary feature representations. Our explanation-guided correlation analysis reveals correlation gaps between adversarial samples and the corresponding perturbations performed on them. Using a case study on explanation-guided evasion, we show the broader usage of our methodology for assessing robustness of ML models.
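The kind of sample-level and dataset-level correlation analysis the abstract describes can be illustrated with a minimal sketch. This is not the paper's metric suite: the function names, the top-k overlap score, and the tolerance value are all hypothetical, and the attribution vector is assumed to come from some feature-attribution explainer run on the adversarial sample.

```python
def perturbed_features(original, adversarial, tol=1e-9):
    """Indices of features whose value differs between the two samples."""
    return {
        i
        for i, (a, b) in enumerate(zip(original, adversarial))
        if abs(a - b) > tol
    }


def top_k_features(attributions, k):
    """Indices of the k features with the largest absolute attribution."""
    order = sorted(
        range(len(attributions)),
        key=lambda i: abs(attributions[i]),
        reverse=True,
    )
    return set(order[:k])


def sample_overlap(original, adversarial, attributions, k):
    """Hypothetical sample-level score: the fraction of perturbed features
    that also rank among the top-k features of the post-evasion explanation.
    Returns None when no feature was perturbed."""
    perturbed = perturbed_features(original, adversarial)
    if not perturbed:
        return None
    return len(perturbed & top_k_features(attributions, k)) / len(perturbed)


def dataset_overlap(samples, k):
    """Hypothetical dataset-level score: the mean of the sample-level scores.
    `samples` is an iterable of (original, adversarial, attributions) triples;
    samples without any perturbation are skipped."""
    scores = []
    for original, adversarial, attributions in samples:
        score = sample_overlap(original, adversarial, attributions, k)
        if score is not None:
            scores.append(score)
    return sum(scores) / len(scores) if scores else None


# Toy example: features 0 and 3 were perturbed, but only feature 0
# ranks among the top-2 attributed features.
example = ([0, 0, 1, 0], [1, 0, 1, 1], [0.5, -0.1, 0.05, 0.02])
print(sample_overlap(*example, k=2))
print(dataset_overlap([example], k=2))
```

A low score under a measure like this would be one way to surface a gap between what an attack changed and what the explanation credits for the evasive prediction; the aggregate evasion rate alone cannot show that.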
Pages: 207-228
Page count: 22