Machine-Learning-Based Android Malware Family Classification Using Built-In and Custom Permissions

Cited by: 8
Authors
Kim, Minki [1]
Kim, Daehan [1]
Hwang, Changha [2]
Cho, Seongje [3]
Han, Sangchul [4]
Park, Minkyu [4]
Affiliations
[1] Dankook Univ, Dept Data & Knowledge Serv Engn, Yongin 16890, South Korea
[2] Dankook Univ, Dept Stat, Yongin 16890, South Korea
[3] Dankook Univ, Dept Software Sci, Yongin 16890, South Korea
[4] Konkuk Univ, Dept Comp Engn, Chungju 27478, South Korea
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 21
Funding
National Research Foundation of Singapore;
Keywords
Android malware; malware family classification; machine learning; built-in permission; custom permission; balanced accuracy; Matthews correlation coefficient; DETECTION SYSTEM; FEATURES;
DOI
10.3390/app112110244
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Malware family classification groups malware samples that have the same or similar characteristics into the same family. It plays a crucial role in understanding notable malicious patterns and recovering from malware infections. Although many machine learning approaches have been devised for this problem, several questions remain open, including "Which features, classifiers, and evaluation metrics are better for malware familial classification?" In this paper, we propose a machine learning approach to Android malware family classification using built-in and custom permissions. Each Android app must declare proper permissions to access restricted resources or to perform restricted actions. Permission declaration is an efficient and obfuscation-resilient feature for malware analysis. We developed a malware family classification technique using permissions and conducted extensive experiments with several classifiers on a well-known dataset, DREBIN. We then evaluated the classifiers in terms of four metrics: macro-level F1-score, accuracy, balanced accuracy (BAC), and the Matthews correlation coefficient (MCC). BAC and the MCC are known to be appropriate for evaluating imbalanced data classification. Our experimental results showed that: (i) custom permissions had a positive impact on classification performance; (ii) even when the same classifier and the same feature information were used, there was a difference of up to 3.67% between accuracy and BAC; (iii) LightGBM and AdaBoost performed better than the other classifiers we considered.
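The four metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the family labels "A", "B", "C" and the predictions are hypothetical toy data chosen to be imbalanced, and the MCC uses the standard multiclass generalization, which is assumed here rather than confirmed by the abstract.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted family equals the true family."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-family recall, so every family counts equally."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[k] / support[k] for k in support) / len(support)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-family F1 = 2*TP / (true_k + pred_k)."""
    true_n, pred_n = Counter(y_true), Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    labels = set(true_n) | set(pred_n)
    return sum(2 * hits[k] / (true_n[k] + pred_n[k]) for k in labels) / len(labels)


def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (assumed generalization)."""
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    true_n, pred_n = Counter(y_true), Counter(y_pred)
    cross = sum(true_n[k] * pred_n[k] for k in true_n)
    denom = sqrt((s * s - sum(v * v for v in pred_n.values()))
                 * (s * s - sum(v * v for v in true_n.values())))
    return (c * s - cross) / denom if denom else 0.0


# Hypothetical imbalanced data: 90 samples of family A, 8 of B, 2 of C.
y_true = ["A"] * 90 + ["B"] * 8 + ["C"] * 2
# Hypothetical classifier: all of A correct, half of B correct, C always missed.
y_pred = ["A"] * 90 + ["B"] * 4 + ["A"] * 4 + ["A"] * 2

acc = accuracy(y_true, y_pred)
bac = balanced_accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
m = mcc(y_true, y_pred)
print(f"accuracy={acc:.4f} BAC={bac:.4f} macro-F1={f1:.4f} MCC={m:.4f}")
```

On this toy data, accuracy is 0.94 while BAC is 0.5, because the two small families are mostly misclassified. The gap is far larger than the 3.67% reported in the abstract, but it shows the same effect: accuracy is dominated by the largest family, whereas BAC weights each family equally.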
Pages: 24