Towards a fair comparison and realistic evaluation framework of android malware detectors based on static analysis and machine learning

Cited by: 15
Authors
Molina-Coronado, Borja [1]
Mori, Usue [2]
Mendiburu, Alexander [1]
Miguel-Alonso, Jose [1]
Affiliations
[1] Univ Basque Country UPV EHU, Dept Comp Architecture & Technol, Ps Manuel Lardizabal 1, Donostia San Sebastian 20018, Gipuzkoa, Spain
[2] Univ Basque Country UPV EHU, Dept Comp Sci & Artificial Intelligence, Ps Manuel Lardizabal 1, Donostia San Sebastian 20018, Gipuzkoa, Spain
Keywords
Android malware detection; Machine learning; Mobile security; Experimental analysis; Static analysis; OBFUSCATION; DISCOVERY; KNOWLEDGE; MODEL;
DOI
10.1016/j.cose.2022.102996
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
As in other cybersecurity areas, machine learning (ML) techniques have emerged as a promising solution to detect Android malware. In this sense, many proposals employing a variety of algorithms and feature sets have been presented to date, often reporting impressive detection performances. However, the lack of reproducibility and the absence of a standard evaluation framework make these proposals difficult to compare. In this paper, we perform an analysis of 10 influential research works on Android malware detection using a common evaluation framework. We have identified five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models and their performances. In particular, we analyze the effect of (1) the presence of duplicated samples, (2) label (goodware/greyware/malware) attribution, (3) class imbalance, (4) the presence of apps that use evasion techniques, and (5) the evolution of apps. Based on this extensive experimentation, we conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results. Our findings also highlight that it is imperative to generate realistic experimental scenarios, taking into account the aforementioned factors, to foster the rise of better ML-based Android malware detection solutions. (c) 2022 Elsevier Ltd. All rights reserved.
Pages: 16
Related papers
50 in total
  • [31] Android Malware Detection Using Hybrid Analysis and Machine Learning Technique
    Yang, Fan
    Zhuang, Yi
    Wang, Jun
    CLOUD COMPUTING AND SECURITY, PT II, 2017, 10603 : 565 - 575
  • [32] ANALYSIS OF FEATURES SELECTION AND MACHINE LEARNING CLASSIFIER IN ANDROID MALWARE DETECTION
    Mas'ud, Mohd Zaki
    Sahib, Shahrin
    Abdollah, Mohd Faizal
    Selamat, Siti Rahayu
    Yusof, Robiah
    2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND APPLICATIONS (ICISA), 2014,
  • [33] Static Analysis of Android Malware Detection using Deep Learning
    Sandeep, H. R.
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 841 - 845
  • [34] FEdroid: a lightweight and interpretable machine learning-based android malware detection system
    Huang, Hong
    Huang, Weitao
    Zhou, Yinghang
    Luo, Wengang
    Wang, Yunfei
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2025, 28 (04):
  • [35] Malware Detection: A Framework for Reverse Engineered Android Applications Through Machine Learning Algorithms
    Urooj, Beenish
    Shah, Munam Ali
    Maple, Carsten
    Abbasi, Muhammad Kamran
    Riasat, Sidra
    IEEE ACCESS, 2022, 10 : 89031 - 89050
  • [36] Integrating Static and Dynamic Malware Analysis Using Machine Learning
    Mangialardo, R. J.
    Duarte, J. C.
    IEEE LATIN AMERICA TRANSACTIONS, 2015, 13 (09) : 3080 - 3087
  • [37] MLDroid: framework for Android malware detection using machine learning techniques
    Arvind Mahindru
    A. L. Sangal
    Neural Computing and Applications, 2021, 33 : 5183 - 5240
  • [38] An Ensemble Approach Based on Fuzzy Logic Using Machine Learning Classifiers for Android Malware Detection
    Atacak, Ismail
    APPLIED SCIENCES-BASEL, 2023, 13 (03):
  • [39] Comprehensive Android Malware Detection: Leveraging Machine Learning and Sandboxing Techniques through Static and Dynamic Analysis
    Bhooshan, Prashant
    Darshan, Shiva S. L.
    Sonkar, Nidhi
    2024 IEEE 21ST INTERNATIONAL CONFERENCE ON MOBILE AD-HOC AND SMART SYSTEMS, MASS 2024, 2024, : 580 - 585
  • [40] Hybrid machine learning model for malware analysis in android apps
    Bashir, Saba
    Maqbool, Farwa
    Khan, Farhan Hassan
    Abid, Asif Sohail
    PERVASIVE AND MOBILE COMPUTING, 2024, 97