Machine-Learning-Based Android Malware Family Classification Using Built-In and Custom Permissions

Cited by: 8
Authors
Kim, Minki [1]
Kim, Daehan [1]
Hwang, Changha [2]
Cho, Seongje [3]
Han, Sangchul [4]
Park, Minkyu [4]
Affiliations
[1] Dankook Univ, Dept Data & Knowledge Serv Engn, Yongin 16890, South Korea
[2] Dankook Univ, Dept Stat, Yongin 16890, South Korea
[3] Dankook Univ, Dept Software Sci, Yongin 16890, South Korea
[4] Konkuk Univ, Dept Comp Engn, Chungju 27478, South Korea
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 21
Funding
National Research Foundation of Singapore;
Keywords
Android malware; malware family classification; machine learning; built-in permission; custom permission; balanced accuracy; Matthews correlation coefficient; DETECTION SYSTEM; FEATURES;
DOI
10.3390/app112110244
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Malware family classification groups malware samples that have the same or similar characteristics into the same family. It plays a crucial role in understanding notable malicious patterns and recovering from malware infections. Although many machine learning approaches have been devised for this problem, several questions remain open, including: "Which features, classifiers, and evaluation metrics are better for malware family classification?" In this paper, we propose a machine learning approach to Android malware family classification using built-in and custom permissions. Each Android app must declare proper permissions to access restricted resources or to perform restricted actions. Permission declaration is an efficient and obfuscation-resilient feature for malware analysis. We developed a malware family classification technique using permissions and conducted extensive experiments with several classifiers on a well-known dataset, DREBIN. We then evaluated the classifiers in terms of four metrics: macro-level F1-score, accuracy, balanced accuracy (BAC), and the Matthews correlation coefficient (MCC). BAC and the MCC are known to be appropriate for evaluating imbalanced data classification. Our experimental results showed that: (i) custom permissions had a positive impact on classification performance; (ii) even when the same classifier and the same feature information were used, there was a difference of up to 3.67% between accuracy and BAC; (iii) LightGBM and AdaBoost performed better than the other classifiers we considered.
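The gap between accuracy and BAC reported in the abstract can be illustrated with a small sketch. The code below is not the authors' evaluation code. It is a minimal pure-Python version of the four metrics using their standard textbook definitions (BAC as the mean of per-class recall, MCC in its multiclass form), and the family labels and predictions are made-up toy data chosen only to show how an imbalanced class pulls the metrics apart.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted family equals the true family."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every family counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        support = sum(t == label for t in y_true)
        hits = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    numerator = correct * n - sum(true_counts[k] * pred_counts[k] for k in labels)
    denominator = sqrt(n * n - sum(pred_counts[k] ** 2 for k in labels)) * sqrt(
        n * n - sum(true_counts[k] ** 2 for k in labels)
    )
    return numerator / denominator if denominator else 0.0


# Toy data (hypothetical): 8 samples of a large family, 2 of a small one.
y_true = ["FamilyA"] * 8 + ["FamilyB"] * 2
# The classifier gets every FamilyA sample right but misses one FamilyB sample.
y_pred = ["FamilyA"] * 8 + ["FamilyA", "FamilyB"]

print(f"accuracy          = {accuracy(y_true, y_pred):.4f}")
print(f"balanced accuracy = {balanced_accuracy(y_true, y_pred):.4f}")
print(f"macro F1          = {macro_f1(y_true, y_pred):.4f}")
print(f"MCC               = {mcc(y_true, y_pred):.4f}")
```

On this toy data accuracy is 0.90 while BAC is 0.75, because the single error falls on the small family and halves its recall. Accuracy is dominated by the large family, which is why BAC and the MCC are the more informative choices for imbalanced family distributions.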
Pages: 24
Related papers
50 in total
  • [31] Combining traditional machine learning and anomaly detection for several imbalanced Android malware dataset's classification
    Gan, Yiwei
    Han, Qian
    Gao, Yumeng
    PROCEEDINGS OF 2022 7TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING TECHNOLOGIES, ICMLT 2022, 2022, : 74 - 80
  • [32] A Survey on Android Malware Detection Techniques Using Supervised Machine Learning
    Altaha, Safa J.
    Aljughaiman, Ahmed
    Gul, Sonia
    IEEE ACCESS, 2024, 12 : 173168 - 173191
  • [33] A Machine-Learning-Based Framework for Supporting Malware Detection and Analysis
    Cuzzocrea, Alfredo
    Mercaldo, Francesco
    Martinelli, Fabio
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS, ICCSA 2021, PT III, 2021, 12951 : 353 - 365
  • [34] A review of detecting malware in android devices based on machine learning techniques
    Sharma, Monika
    Kaul, Ajay
    EXPERT SYSTEMS, 2024, 41 (01)
  • [35] An in-depth review of machine learning based Android malware detection
    Muzaffar, Ali
    Hassen, Hani Ragab
    Lones, Michael A.
    Zantout, Hind
    COMPUTERS & SECURITY, 2022, 121
  • [36] Android Malware Category and Family Classification Using Static Analysis
    Cong-Danh Nguyen
    Nghi Hoang Khoa
    Khoa Nguyen-Dang Doan
    Nguyen Tan Cam
    2023 INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, ICOIN, 2023, : 162 - 167
  • [37] Android Malware Family Classification and Characterization Using CFG and DFG
    Xu, Zhiwu
    Ren, Kerong
    Song, Fu
    2019 13TH INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF SOFTWARE ENGINEERING (TASE 2019), 2019, : 49 - 56
  • [38] Toward Semantic-Based Android Malware Detection Using Model Checking and Machine Learning
    El Hatib, Souad
    Ricaud, Loic
    Desharnais, Josee
    Tawbi, Nadia
    RISKS AND SECURITY OF INTERNET AND SYSTEMS (CRISIS 2020), 2021, 12528 : 289 - 307
  • [39] Android Malware Detection Using Category-Based Machine Learning Classifiers
    Alatwi, Huda Ali
    Oh, Tae
    Fokoue, Ernest
    Stackpole, Bill
    SIGITE'16: PROCEEDINGS OF THE 17TH ANNUAL CONFERENCE ON INFORMATION TECHNOLOGY EDUCATION, 2016, : 54 - 59
  • [40] URL-Based Dynamic Monitoring of Android Malware using Machine Learning
    Somarriba, Oscar
    Urbina, Henry Jaentschke
    PROCEEDINGS OF THE 2022 IEEE 40TH CENTRAL AMERICA AND PANAMA CONVENTION (CONCAPAN), 2022,