Is It Overkill? Analyzing Feature-Space Concept Drift in Malware Detectors

Cited by: 1
Authors
Chen, Zhi [1]
Zhang, Zhenning [1]
Kan, Zeliang [2, 3]
Yang, Limin [1]
Cortellazzi, Jacopo [2, 3]
Pendlebury, Feargus [3]
Pierazzi, Fabio [2]
Cavallaro, Lorenzo [3]
Wang, Gang [1]
Affiliations
[1] Univ Illinois, Urbana, IL 61081 USA
[2] Kings Coll London, London, England
[3] UCL, London, England
Source
2023 IEEE SECURITY AND PRIVACY WORKSHOPS, SPW | 2023
DOI
10.1109/SPW59333.2023.00007
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Concept drift is a major challenge faced by machine learning-based malware detectors when deployed in practice. While existing works have investigated methods to detect concept drift, the main causes behind the drift are not yet well understood. In this paper, we design experiments to empirically analyze the impact of feature-space drift (new features introduced by new samples) and compare it with data-space drift (data distribution shift over existing features). Surprisingly, we find that data-space drift is the dominant contributor to model degradation over time, while feature-space drift has little to no impact. This is consistently observed over both Android and PE malware detectors, with different feature types and feature engineering methods, across different settings. We further validate this observation with recent online-learning-based malware detectors that incrementally update the feature space. Our results indicate that concept drift may be handled without frequent feature updating, and we further discuss open questions for future research.
Pages: 21-28
Number of pages: 8