Is It Overkill? Analyzing Feature-Space Concept Drift in Malware Detectors

Cited by: 1
Authors
Chen, Zhi [1]
Zhang, Zhenning [1]
Kan, Zeliang [2, 3]
Yang, Limin [1]
Cortellazzi, Jacopo [2, 3]
Pendlebury, Feargus [3]
Pierazzi, Fabio [2]
Cavallaro, Lorenzo [3]
Wang, Gang [1]
Affiliations
[1] Univ Illinois, Urbana, IL 61081 USA
[2] Kings Coll London, London, England
[3] UCL, London, England
Source
2023 IEEE SECURITY AND PRIVACY WORKSHOPS, SPW | 2023
DOI
10.1109/SPW59333.2023.00007
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Concept drift is a major challenge faced by machine learning-based malware detectors when deployed in practice. While existing works have investigated methods to detect concept drift, the main causes behind the drift are not yet well understood. In this paper, we design experiments to empirically analyze the impact of feature-space drift (new features introduced by new samples) and compare it with data-space drift (data distribution shift over existing features). Surprisingly, we find that data-space drift is the dominant contributor to model degradation over time, while feature-space drift has little to no impact. This is consistently observed over both Android and PE malware detectors, with different feature types and feature engineering methods, across different settings. We further validate this observation with recent online-learning-based malware detectors that incrementally update the feature space. Our results indicate the possibility of handling concept drift without frequent feature updating, and we further discuss the open questions for future research.
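The two kinds of drift named in the abstract can be illustrated with a toy sketch. This is not the paper's methodology or its metrics: the function names, the set-of-strings sample representation, and both scoring formulas are assumptions chosen only to make the distinction concrete. Feature-space drift is scored here as the share of features in new samples that never appeared in training, and data-space drift as the mean change in occurrence rate over features that did appear in training.

```python
def _vocab(samples):
    """Union of all feature names appearing in a list of samples (each sample is a set)."""
    return set().union(*samples)


def _rate(samples, feature):
    """Fraction of samples that contain the given feature."""
    if not samples:
        return 0.0
    return sum(feature in s for s in samples) / len(samples)


def feature_space_drift(train_samples, new_samples):
    """Hypothetical score: share of distinct new-sample features unseen in training."""
    new_vocab = _vocab(new_samples)
    if not new_vocab:
        return 0.0
    return len(new_vocab - _vocab(train_samples)) / len(new_vocab)


def data_space_drift(train_samples, new_samples):
    """Hypothetical score: mean absolute change in occurrence rate over training features."""
    train_vocab = _vocab(train_samples)
    if not train_vocab:
        return 0.0
    total = sum(
        abs(_rate(train_samples, f) - _rate(new_samples, f)) for f in train_vocab
    )
    return total / len(train_vocab)


# Toy data: feature names are made up for illustration.
train = [{"perm.SEND_SMS", "api.getDeviceId"}, {"perm.SEND_SMS"}]
new = [{"api.getDeviceId", "api.loadClass"}, {"api.getDeviceId"}]

fs = feature_space_drift(train, new)
ds = data_space_drift(train, new)
print(f"feature-space drift: {fs:.2f}, data-space drift: {ds:.2f}")
```

In this toy setup the two scores move independently: a detector with a fixed feature vocabulary ignores "api.loadClass" entirely, yet it still sees the shift in how often the existing features occur.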
Pages: 21-28
Page count: 8