Towards a fair comparison and realistic evaluation framework of android malware detectors based on static analysis and machine learning

Cited by: 15

Authors:

Molina-Coronado, Borja [1]

Mori, Usue [2]

Mendiburu, Alexander [1]

Miguel-Alonso, Jose [1]

Affiliations:

[1] Univ Basque Country UPV EHU, Dept Comp Architecture & Technol, Ps Manuel Lardizabal 1, Donostia San Sebastian 20018, Gipuzkoa, Spain

[2] Univ Basque Country UPV EHU, Dept Comp Sci & Artificial Intelligence, Ps Manuel Lardizabal 1, Donostia San Sebastian 20018, Gipuzkoa, Spain

Source:

COMPUTERS & SECURITY | 2023 / Vol. 124

Keywords:

Android malware detection; Machine learning; Mobile security; Experimental analysis; Static analysis; OBFUSCATION; DISCOVERY; KNOWLEDGE; MODEL

DOI:

10.1016/j.cose.2022.102996

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

As in other cybersecurity areas, machine learning (ML) techniques have emerged as a promising solution to detect Android malware. Many proposals employing a variety of algorithms and feature sets have been presented to date, often reporting impressive detection performance. However, the lack of reproducibility and the absence of a standard evaluation framework make these proposals difficult to compare. In this paper, we perform an analysis of 10 influential research works on Android malware detection using a common evaluation framework. We have identified five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models and their performance. In particular, we analyze the effect of (1) the presence of duplicated samples, (2) label (goodware/greyware/malware) attribution, (3) class imbalance, (4) the presence of apps that use evasion techniques, and (5) the evolution of apps. Based on this extensive experimentation, we conclude that the studied ML-based detectors have been evaluated optimistically, which explains the good published results. Our findings also highlight that it is imperative to generate realistic experimental scenarios, taking into account the aforementioned factors, to foster the rise of better ML-based Android malware detection solutions. (c) 2022 Elsevier Ltd. All rights reserved.
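
As a rough illustration of how three of the factors named in the abstract (duplicated samples, class imbalance, and the evolution of apps) might be handled when preparing an evaluation dataset, consider the minimal sketch below. It is not the authors' framework. The `App` record, its fields (`sha256`, `label`, `first_seen`), and the helper names are hypothetical, and treating duplicates as exact hash matches is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class App:
    # Hypothetical record; field names are assumptions for illustration.
    sha256: str       # identifier used here to spot exact duplicates
    label: int        # 1 = malware, 0 = goodware
    first_seen: date  # date the sample was first observed


def deduplicate(apps):
    """Keep only the first occurrence of each hash."""
    seen = set()
    unique = []
    for app in apps:
        if app.sha256 not in seen:
            seen.add(app.sha256)
            unique.append(app)
    return unique


def temporal_split(apps, cutoff):
    """Train on apps seen before the cutoff, test on the rest,
    so the test set never contains samples older than the training set."""
    train = [a for a in apps if a.first_seen < cutoff]
    test = [a for a in apps if a.first_seen >= cutoff]
    return train, test


def malware_ratio(apps):
    """Fraction of malware in a set; useful for checking class imbalance."""
    return sum(a.label for a in apps) / len(apps) if apps else 0.0
```

Checking `malware_ratio` on both halves of the split shows whether the class proportions in the test set resemble those assumed during training, which is one way an evaluation can turn out more optimistic than a deployment.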

Pages: 16

