Is It Overkill? Analyzing Feature-Space Concept Drift in Malware Detectors

Cited by: 1

Authors:

Chen, Zhi [1]

Zhang, Zhenning [1]

Kan, Zeliang [2,3]

Yang, Limin [1]

Cortellazzi, Jacopo [2,3]

Pendlebury, Feargus [3]

Pierazzi, Fabio [2]

Cavallaro, Lorenzo [3]

Wang, Gang [1]

Affiliations:

[1] Univ Illinois, Urbana, IL 61081 USA

[2] Kings Coll London, London, England

[3] UCL, London, England

Source:

2023 IEEE SECURITY AND PRIVACY WORKSHOPS, SPW | 2023

DOI:

10.1109/SPW59333.2023.00007

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Concept drift is a major challenge faced by machine learning-based malware detectors when deployed in practice. While existing works have investigated methods to detect concept drift, the main causes behind the drift are not yet well understood. In this paper, we design experiments to empirically analyze the impact of feature-space drift (new features introduced by new samples) and compare it with data-space drift (data distribution shift over existing features). Surprisingly, we find that data-space drift is the dominant contributor to model degradation over time, while feature-space drift has little to no impact. This is consistently observed over both Android and PE malware detectors, with different feature types and feature engineering methods, across different settings. We further validate this observation with recent online-learning-based malware detectors that incrementally update the feature space. Our results indicate the possibility of handling concept drift without frequent feature updating, and we further discuss the open questions for future research.

Pages: 21-28

Page count: 8
