Neural malware analysis with attention mechanism

Cited by: 29
Authors
Yakura, Hiromu [1, 4]
Shinozaki, Shinnosuke [1]
Nishimura, Reon [1]
Oyama, Yoshihiro [2]
Sakuma, Jun [3, 4, 5]
Affiliations
[1] Univ Tsukuba, Coll Informat Sci, Informat Engn, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Tsukuba, Ibaraki, Japan
[3] Univ Tsukuba, Sch Syst & Informat Engn, Dept Comp Sci, Tsukuba, Ibaraki, Japan
[4] RIKEN, Ctr Adv Intelligence Project, Tokyo, Japan
[5] JST CREST, Tokyo, Japan
Keywords
Malware analysis; Convolutional neural network; Attention mechanism; Static analysis; Machine learning;
DOI
10.1016/j.cose.2019.101592
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Objectives: To confront diverse types of malware that evolve from moment to moment, it is important to quickly acquire deep knowledge of the characteristics of malware samples. This paper proposes a method for extracting the byte sequences of a given malware sample that characterize its functionality, which reduces the workload of human analysts who investigate the sample. Design & methods: By applying a convolutional neural network (CNN) with an attention mechanism to an image converted from binary data, the proposed method computes an attention map, which is expected to indicate the regions that matter most for classification. Distinguishing these regions makes it possible to extract byte sequences characteristic of the malware family from the binary data and to provide useful information to human analysts without a priori knowledge. Results: An evaluation experiment using a malware dataset shows that the sequences extracted by the proposed method provide useful information for manual analysis. For example, in the case of Backdoor.Win32.Agobot.lt, the region with the highest importance in the attention map points to a function that receives commands from a remote server via IRC. This result characterizes the behavior of its family, Worm:Win32/Gaobot, which executes commands sent via IRC to construct a botnet. Conclusions: By taking advantage of a CNN with an attention mechanism, the proposed method is shown to selectively identify important regions in binaries for manual analysis of malware samples. © 2019 Elsevier Ltd. All rights reserved.
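The abstract does not give implementation details, so the following is only a minimal sketch of the two bookkeeping steps it implies: laying out a binary as an image, and mapping the highest-scoring attention cell back to byte offsets. The function names, the fixed image width, the zero padding, and the assumption that each attention cell covers a square `cell` x `cell` patch of pixels are all hypothetical and are not taken from the paper; the CNN that would produce the attention map is omitted.

```python
from typing import List, Tuple


def bytes_to_image(data: bytes, width: int = 16) -> List[List[int]]:
    """Lay out raw bytes row by row as a grayscale image (one byte per pixel).

    Assumption: the last row is zero-padded to the chosen width.
    """
    padded = data + b"\x00" * (-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def top_region_byte_ranges(
    attention: List[List[float]], width: int, cell: int, data_len: int
) -> List[Tuple[int, int]]:
    """Return the byte ranges covered by the highest-scoring attention cell.

    Assumption: attention[r][c] scores the cell x cell pixel patch whose
    top-left pixel is (r * cell, c * cell). A square patch spans several
    image rows, so it maps to one half-open (start, end) range per row.
    """
    # Locate the attention cell with the largest score.
    best_r, best_c = max(
        ((r, c) for r in range(len(attention)) for c in range(len(attention[r]))),
        key=lambda rc: attention[rc[0]][rc[1]],
    )
    ranges = []
    for row in range(best_r * cell, (best_r + 1) * cell):
        start = row * width + best_c * cell
        if start < data_len:  # skip rows that lie entirely in the padding
            ranges.append((start, min(start + cell, data_len)))
    return ranges
```

An analyst-facing tool could then slice the original file at the returned offsets and hand those bytes to a disassembler; how the paper itself selects and presents sequences is not stated in the abstract.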
Pages: 15