Towards a fair comparison and realistic evaluation framework of android malware detectors based on static analysis and machine learning

被引:15
|
作者
Molina-Coronado, Borja [1 ]
Mori, Usue [2 ]
Mendiburu, Alexander [1 ]
Miguel-Alonso, Jose [1 ]
机构
[1] Univ Basque Country UPV EHU, Dept Comp Architecture & Technol, Ps Manuel Lardizabal 1, Donostia San Sebastian 20018, Gipuzkoa, Spain
[2] Univ Basque Country UPV EHU, Dept Comp Sci & Artificial Intelligence, Ps Manuel Lardizabal 1, Donostia San Sebastian 20018, Gipuzkoa, Spain
关键词
Android malware detection; Machine learning; Mobile security; Experimental analysis; Static analysis; OBFUSCATION; DISCOVERY; KNOWLEDGE; MODEL;
D O I
10.1016/j.cose.2022.102996
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
As in other cybersecurity areas, machine learning (ML) techniques have emerged as a promising solution to detect Android malware. In this sense, many proposals employing a variety of algorithms and feature sets have been presented to date, often reporting impresive detection performances. However, the lack of reproducibility and the absence of a standard evaluation framework make these proposals difficult to compare. In this paper, we perform an analysis of 10 influential research works on Android malware detection using a common evaluation framework. We have identified five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models and their performances. In particular, we analyze the effect of (1) the presence of duplicated samples, (2) label (goodware/greyware/malware) attribution, (3) class imbalance, (4) the presence of apps that use evasion techniques and, (5) the evolution of apps. Based on this extensive experimentation, we conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results. Our findings also highlight that it is imperative to generate realistic experimental scenarios, taking into account the aforementioned factors, to foster the rise of better ML-based Android malware detection solutions. (c) 2022 Elsevier Ltd. All rights reserved.
引用
收藏
页数:16
相关论文
共 50 条
  • [1] Static analysis framework for permission-based dataset generation and android malware detection using machine learning
    Pathak, Amarjyoti
    Kumar, Th. Shanta
    Barman, Utpal
    EURASIP JOURNAL ON INFORMATION SECURITY, 2024, 2024 (01):
  • [2] AdDroid: Rule-Based Machine Learning Framework for Android Malware Analysis
    Mehtab, Anam
    Shahid, Waleed Bin
    Yaqoob, Tahreem
    Amjad, Muhammad Faisal
    Abbas, Haider
    Afzal, Hammad
    Saqib, Malik Najmus
    MOBILE NETWORKS & APPLICATIONS, 2020, 25 (01) : 180 - 192
  • [3] AdDroid: Rule-Based Machine Learning Framework for Android Malware Analysis
    Anam Mehtab
    Waleed Bin Shahid
    Tahreem Yaqoob
    Muhammad Faisal Amjad
    Haider Abbas
    Hammad Afzal
    Malik Najmus Saqib
    Mobile Networks and Applications, 2020, 25 : 180 - 192
  • [4] Backdoor Attack on Machine Learning Based Android Malware Detectors
    Li, Chaoran
    Chen, Xiao
    Wang, Derui
    Wen, Sheng
    Ahmed, Muhammad Ejaz
    Camtepe, Seyit
    Xiang, Yang
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2022, 19 (05) : 3357 - 3370
  • [5] Meizodon: Security Benchmarking Framework for Static Android Malware Detectors
    Rodriguez, Sebastiaan Alvarez
    van der Kouwe, Erik
    THIRD CENTRAL EUROPEAN CYBERSECURITY CONFERENCE (CECC 2019), 2019,
  • [6] A Survey of Android Malware Static Detection Technology Based on Machine Learning
    Wu, Qing
    Zhu, Xueling
    Liu, Bo
    MOBILE INFORMATION SYSTEMS, 2021, 2021
  • [7] Android Malware Detection Based on Machine Learning
    Wang, Qing-Fei
    Fang, Xiang
    2018 4TH ANNUAL INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC 2018), 2018, : 434 - 436
  • [8] A Review of Android Malware Detection Approaches Based on Machine Learning
    Liu, Kaijun
    Xu, Shengwei
    Xu, Guoai
    Zhang, Miao
    Sun, Dawei
    Liu, Haifeng
    IEEE ACCESS, 2020, 8 (08): : 124579 - 124607
  • [9] Static and Dynamic Malware Analysis Using Machine Learning
    Raghuraman, Chandni
    Suresh, Sandhya
    Shivshankar, Suraj
    Chapaneri, Radhika
    FIRST INTERNATIONAL CONFERENCE ON SUSTAINABLE TECHNOLOGIES FOR COMPUTATIONAL INTELLIGENCE, 2020, 1045 : 793 - 806
  • [10] Machine learning based hybrid behavior models for Android malware analysis
    Chuang, Hsin-Yu
    Wang, Sheng-De
    2015 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE SECURITY AND RELIABILITY (QRS 2015), 2015, : 201 - 206