A graph mining approach for detecting unknown malwares

Cited by: 46
Authors
Eskandari, Mojtaba [1]
Hashemi, Sattar [1]
Affiliation
[1] Shiraz Univ, Dept Comp Sci & Engn, Shiraz, Iran
Keywords
Malware; Detection; Unknown malwares; PE-file; CFG; API; SYSTEM;
DOI
10.1016/j.jvlc.2012.02.002
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Malware is a serious problem in modern society. Although signature-based malicious code detection is the standard technique in all commercial antivirus software, it can detect a virus only after the virus has already caused damage and been registered. It therefore fails to detect new (unknown) malware. Since most malware exhibits similar behavior, a behavior-based method can detect unknown malware. The behavior of a program can be represented by the set of API (application programming interface) functions it calls, so a classifier can be trained on the API calls of a set of programs to build a learning model, yielding an intelligent system that detects unknown malware automatically. In addition, the control flow graph (CFG) is an appealing representation model for visualizing the structure of executable files, and it captures another semantic aspect of programs. This paper presents a robust semantic-based method for detecting unknown malware that combines a visual model (the CFG) with the called APIs. The main contribution of this paper is extracting the CFG from programs and combining it with the extracted API calls to obtain more information about executable files. This new representation model is called API-CFG. In addition, to make learning and classification fast, the control flow graphs are converted to a set of feature vectors. Our approach is capable of classifying unseen benign and malicious code with high accuracy. The results show a statistically significant improvement over the n-grams based detection method. (C) 2012 Elsevier Ltd. All rights reserved.
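The abstract does not say how the control flow graphs become feature vectors, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes each CFG edge is labelled with the API called at its source and destination nodes, flattens the graph into a 0/1 vector over all ordered API pairs, and uses a 1-nearest-neighbour rule as a stand-in classifier. The function names, the three-API vocabulary, and the two toy training samples are hypothetical.

```python
from itertools import product


def api_cfg_to_vector(edges, api_vocab):
    """Flatten an API-labelled CFG into a fixed-length 0/1 vector.

    Assumed encoding (not taken from the paper): one position per ordered
    pair of APIs, set to 1 when the graph has an edge from a node calling
    the first API to a node calling the second.
    """
    index = {pair: i for i, pair in enumerate(product(api_vocab, repeat=2))}
    vector = [0] * len(index)
    for src, dst in edges:
        if (src, dst) in index:  # ignore APIs outside the vocabulary
            vector[index[(src, dst)]] = 1
    return vector


def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def classify_1nn(vector, labelled_vectors):
    """Return the label of the closest training vector (stand-in classifier)."""
    return min(labelled_vectors, key=lambda item: hamming(vector, item[0]))[1]


# Hypothetical toy data: a tiny API vocabulary and two labelled programs.
API_VOCAB = ["CreateFileW", "WriteFile", "RegSetValueExW"]
benign_edges = [("CreateFileW", "WriteFile")]
malicious_edges = [("CreateFileW", "WriteFile"), ("WriteFile", "RegSetValueExW")]

training = [
    (api_cfg_to_vector(benign_edges, API_VOCAB), "benign"),
    (api_cfg_to_vector(malicious_edges, API_VOCAB), "malicious"),
]

# An unseen program whose only API-to-API edge matches the malicious sample.
unseen = api_cfg_to_vector([("WriteFile", "RegSetValueExW")], API_VOCAB)
label = classify_1nn(unseen, training)
print(label)
```

Because every program maps to a vector of the same length, any standard vector-based classifier could replace the nearest-neighbour rule without changing the encoding step.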
Pages: 154-162
Number of pages: 9