Application of Anomaly Detection Models to Malware Detection in the Presence of Concept Drift
Cited by: 2
Authors:
Escudero Garcia, David [1]
DeCastro-Garcia, Noemi [2]
Affiliations:
[1] Univ Leon, Res Inst Appl Sci Cybersecur, Campus Vegazana S-N, Leon 24071, Spain
[2] Univ Leon, Dept Math, Campus Vegazana S-N, Leon 24071, Spain
Machine learning is one of the main approaches to malware detection in the literature, since machine learning models are more adaptive than signature-based solutions. One of the main challenges in applying machine learning to malware detection is concept drift, a change in the data distribution over time. To tackle drift, online models are applied that can be updated dynamically, either passively or by actively detecting change. However, these models require new instances to be labelled before the model can be updated. Labels are usually scarce and cannot be obtained immediately, and imbalance in the data makes the construction of an effective model difficult. Previous studies indicate that concept drift has a lower impact on benign instances, so we test the effectiveness of anomaly detection models for detecting malware in the presence of concept drift. Anomaly detection models need only benign instances for training and may therefore be less affected by the scarcity of labelled malicious instances. The results show that anomaly detection models achieve better results than supervised online models under heavy data imbalance and label scarcity.
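The idea of training only on benign instances can be illustrated with a minimal sketch. This is not the authors' models, features, or data: the class `BenignProfileDetector`, the mean squared z-score, the 0.99 quantile threshold, and the synthetic Gaussian feature vectors are all assumptions chosen for illustration, and the sketch does not simulate concept drift.

```python
import random
import statistics


class BenignProfileDetector:
    """Hypothetical one-class detector: learns per-feature mean/std from benign samples only."""

    def __init__(self, quantile=0.99):
        # Position in the sorted benign training scores used as the threshold (assumed value).
        self.quantile = quantile

    def fit(self, benign):
        columns = list(zip(*benign))
        self.means = [statistics.fmean(c) for c in columns]
        # Fall back to 1.0 for constant features so the score never divides by zero.
        self.stds = [statistics.pstdev(c) or 1.0 for c in columns]
        scores = sorted(self.score(x) for x in benign)
        index = min(len(scores) - 1, int(self.quantile * len(scores)))
        self.threshold = scores[index]
        return self

    def score(self, x):
        # Mean squared z-score: larger values lie further from the benign profile.
        return sum(((v - m) / s) ** 2 for v, m, s in zip(x, self.means, self.stds)) / len(x)

    def is_anomalous(self, x):
        return self.score(x) > self.threshold


rng = random.Random(0)
# Synthetic stand-ins for feature vectors; no malicious labels are used for training.
benign_train = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(500)]
benign_test = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
malicious_test = [[rng.gauss(6.0, 1.0) for _ in range(4)] for _ in range(200)]

detector = BenignProfileDetector().fit(benign_train)
false_positive_rate = sum(map(detector.is_anomalous, benign_test)) / len(benign_test)
detection_rate = sum(map(detector.is_anomalous, malicious_test)) / len(malicious_test)
```

Because the threshold is derived from benign scores alone, the detector needs no labelled malicious instances. That is the property the abstract suggests may help when such labels are scarce.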