Detection of malicious code using the direct hashing and pruning and support vector machine

Cited by: 6
Authors
Ju, YeongJi [1]
Kim, MinGu [2]
Shin, JuHyun [3]
Affiliations
[1] TmaxData Corp, Tech Support, Seoul, South Korea
[2] Chosun Univ, Dept Control & Instrumentat Engn, Gwangju, South Korea
[3] Chosun Univ, Dept New Ind Convergence, 375 Seosuk Dong, Gwangju 61452, South Korea
Funding
National Research Foundation, Singapore;
Keywords
classification; direct hashing and pruning; malicious code; support vector machine;
DOI
10.1002/cpe.5483
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Although open application programming interfaces (APIs) have improved with advances in the software industry, the variety of malicious code has also grown. Thus, many studies have been conducted to characterize the behavior of malicious code based on API data and to determine whether malicious code is included in a specific executable file. Existing methods detect malicious code by analyzing signature data. Detecting mutated malicious code in this manner is time-consuming and has a high false detection rate (see "Detection of malicious code using the FP-growth algorithm and SVM," a paper presented at The First International Conference on Software and Smart Convergence, 2017). Herein, we propose a method that analyzes and detects malicious code using association rule mining and a support vector machine (SVM). The proposed method reduces the false detection rate by mining the rules of malicious and normal code APIs in the portable executable (PE) file, grouping patterns using the direct hashing and pruning (DHP) algorithm, and classifying malicious and normal files using the SVM. The study shows that sensitivity was 71% and precision was 77% when using a single SVM model. Using the association rules together with the SVM model, sensitivity increased to 77% and precision to 81%.
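The hash-based pruning step that gives DHP its name can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `dhp_frequent_pairs`, the bucket count, the toy hash function, and the sample API sets are all assumptions made for illustration, and the sketch covers only frequent 2-itemsets, leaving out DHP's transaction trimming and the SVM stage.

```python
from collections import Counter
from itertools import combinations


def dhp_frequent_pairs(transactions, min_support, n_buckets=7):
    """Return {pair: support} for frequent 2-itemsets, using DHP-style bucket pruning."""
    item_counts = Counter()
    bucket_counts = [0] * n_buckets

    def bucket(pair):
        # Hypothetical deterministic hash of a pair of API names.
        return sum(sum(map(ord, name)) for name in pair) % n_buckets

    # Pass 1: count single items and hash every 2-subset into a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket(pair)] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: a pair is a candidate only if both items are frequent and its
    # bucket count reaches min_support; only candidates are counted exactly.
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t) & frequent_items)
        for pair in combinations(items, 2):
            if bucket_counts[bucket(pair)] >= min_support:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Hypothetical API-call sets, one per PE file (illustrative data only).
samples = [
    {"CreateFileA", "WriteFile", "RegSetValueExA"},
    {"CreateFileA", "WriteFile", "VirtualAlloc"},
    {"CreateFileA", "WriteFile"},
    {"VirtualAlloc", "RegSetValueExA"},
]
print(dhp_frequent_pairs(samples, min_support=3))
# → {('CreateFileA', 'WriteFile'): 3}
```

Because a bucket's count is always at least the support of any pair hashed into it, the bucket filter never discards a truly frequent pair; hash collisions can only let extra candidates through to the exact count in the second pass.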
Pages: 8
Related papers
20 in total
[1] Agrawal R., 1994, P 20 INT C VER LARG
[2] [Anonymous], 1998, TECHNICAL REPORT
[3] Bell J., 2014, MACHINE LEARNING HAN
[4] Cisco, 2017, WHAT IS DIFF VIR WOR
[5] Han Kyungsu, 2011, [Journal of Security Engineering, 보안공학연구논문지], V8, P319
[6] Harrington P., 2012, Machine learning in action
[7] Hyung Bong Lee, 2008, Journal of KISS: Databases, V35, P199
[8] Ju YJ, 2017, 1 INT C SOFTW SMART
[9] Karyotis V., 2016, Malware Diffusion Models for Modern Complex Networks
[10] KIM Hyo-Nam, 2013, [Journal of The Korea Society of Computer and Information, 한국컴퓨터정보학회논문지], V18, P69