On the use of artificial malicious patterns for Android malware detection

Cited by: 41
Authors
Jerbi, Manel [1]
Dagdia, Zaineb Chelly [2, 3]
Bechikh, Slim [1]
Ben Said, Lamjed [1]
Affiliations
[1] Univ Tunis, SMART Lab, ISG Campus, Tunis, Tunisia
[2] Univ Lorraine, LORIA, INRIA, CNRS, F-54000 Nancy, France
[3] Inst Super Gest Tunis, LARODEC, Tunis, Tunisia
Funding
EU Horizon 2020
Keywords
Malware detection; API call sequences; Artificial malicious patterns; Evolutionary algorithm; Android; CLASSIFICATION; SYSTEM;
DOI
10.1016/j.cose.2020.101743
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Malware programs currently represent the most serious threat to computer information systems. Despite researchers' efforts in this field, detection tools remain limited for one main reason: malware developers commonly use obfuscation techniques, that is, sets of transformations that make the code and/or its execution difficult to analyze by hindering both manual and automated inspection. These techniques allow malware to evade detection tools and thus pass as a benign program. To address obfuscation, many researchers have proposed extracting frequent Application Programming Interface (API) call sequences from previously encountered malware programs using pattern mining techniques, thereby building a base of fraudulent behaviors. With this process, detection performance depends heavily on the base of examples of malware behaviors, also called malware patterns. To deal with this shortcoming, this paper proposes a dynamic detection method called Artificial Malware-based Detection (AMD). AMD uses not only extracted malware patterns but also artificially generated ones. The artificial malware patterns are generated using an evolutionary (genetic) algorithm, which evolves a population of API call sequences with the aim of finding new malware behaviors following a set of well-defined evolution rules. The artificial fraudulent behaviors are then inserted into the base of examples to enrich it with unseen malware patterns. The main motivation behind AMD is to diversify the base of malware examples in order to maximize the detection rate. AMD has been tested on different Android malware data sets and compared against recent prominent works using commonly employed performance metrics. The performance analysis of the obtained results shows the merits of the novel AMD approach. (C) 2020 Elsevier Ltd. All rights reserved.
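The pipeline the abstract describes (mine frequent API call sequences from known malware, then evolve new sequences and add them to the pattern base) can be illustrated with a minimal Python sketch. This is not the AMD algorithm itself: the abstract does not specify the fitness function, the evolution rules, or the pattern mining method. The traces, the API call vocabulary, the bigram-based mining, the fitness definition, and all function names below are therefore assumptions made purely for illustration.

```python
import random
from collections import Counter

# Hypothetical API call traces; real ones would come from dynamic analysis.
API_CALLS = ["getDeviceId", "openConnection", "sendTextMessage",
             "query", "write", "delete"]
MALWARE_TRACES = [
    ["getDeviceId", "openConnection", "sendTextMessage", "delete"],
    ["getDeviceId", "openConnection", "write", "sendTextMessage"],
    ["query", "getDeviceId", "openConnection", "sendTextMessage"],
]
BENIGN_TRACES = [
    ["query", "write", "delete"],
    ["openConnection", "write", "query"],
]

def bigrams(seq):
    """Set of consecutive API call pairs in a sequence."""
    return {tuple(seq[i:i + 2]) for i in range(len(seq) - 1)}

def mine_frequent_bigrams(traces, min_support=2):
    """Bigrams that occur in at least min_support traces (a stand-in for pattern mining)."""
    support = Counter()
    for trace in traces:
        support.update(bigrams(trace))  # a set, so each trace counts once
    return {gram for gram, count in support.items() if count >= min_support}

def fitness(seq, malicious, benign):
    """Assumed fitness: reward overlap with malicious bigrams, penalize benign ones."""
    grams = bigrams(seq)
    if not grams:
        return 0.0
    return (len(grams & malicious) - len(grams & benign)) / len(grams)

def crossover(a, b, rng):
    """One-point crossover: prefix of a joined to suffix of b."""
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]

def mutate(seq, rng, rate=0.2):
    """Replace each call with a random API call with probability rate."""
    return [rng.choice(API_CALLS) if rng.random() < rate else call for call in seq]

def evolve(malware_traces, benign_traces, generations=30, pop_size=20, seed=0):
    """Evolve API call sequences and return new ones with positive fitness."""
    rng = random.Random(seed)
    malicious = mine_frequent_bigrams(malware_traces)
    benign = mine_frequent_bigrams(benign_traces, min_support=1)
    # Seed the population with known traces plus mutated copies of them.
    population = [list(t) for t in malware_traces]
    while len(population) < pop_size:
        population.append(mutate(rng.choice(malware_traces), rng))
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, malicious, benign), reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    known = {tuple(t) for t in malware_traces}
    candidates = {tuple(s) for s in population}
    return sorted(list(s) for s in candidates
                  if s not in known and fitness(list(s), malicious, benign) > 0)

if __name__ == "__main__":
    for pattern in evolve(MALWARE_TRACES, BENIGN_TRACES):
        print(" -> ".join(pattern))
```

In this sketch the returned sequences play the role of artificial malware patterns that would be appended to the base of examples. A faithful implementation would replace the assumed fitness and operators with the evolution rules defined in the paper.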
Pages: 22