Explanation-Guided Diagnosis of Machine Learning Evasion Attacks

Times cited: 7

Authors:

Amich, Abderrahmen [1]

Eshete, Birhanu [1]

Affiliations:

[1] Univ Michigan, Dearborn, MI 48128 USA

Source:

SECURITY AND PRIVACY IN COMMUNICATION NETWORKS, SECURECOMM 2021, PT I | 2021 / Vol. 398

Keywords:

Machine learning evasion; Explainable machine learning

DOI:

10.1007/978-3-030-90019-9_11

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Machine Learning (ML) models are susceptible to evasion attacks. Evasion accuracy is typically assessed using aggregate evasion rate, and it is an open question whether aggregate evasion rate enables feature-level diagnosis on the effect of adversarial perturbations on evasive predictions. In this paper, we introduce a novel framework that harnesses explainable ML methods to guide high-fidelity assessment of ML evasion attacks. Our framework enables explanation-guided correlation analysis between pre-evasion perturbations and post-evasion explanations. Towards systematic assessment of ML evasion attacks, we propose and evaluate a novel suite of model-agnostic metrics for sample-level and dataset-level correlation analysis. Using malware and image classifiers, we conduct comprehensive evaluations across diverse model architectures and complementary feature representations. Our explanation-guided correlation analysis reveals correlation gaps between adversarial samples and the corresponding perturbations performed on them. Using a case study on explanation-guided evasion, we show the broader usage of our methodology for assessing robustness of ML models.


Pages: 207-228

Page count: 22

