A Novel Image-Based Malware Classification Model Using Deep Learning

Cited by: 5
Authors
Jiang, Yongkang [1]
Li, Shenghong [1]
Wu, Yue [1]
Zou, Futai [1]
Affiliation
[1] Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering, Shanghai 200240, People's Republic of China
Source
NEURAL INFORMATION PROCESSING (ICONIP 2019), PT II | 2019 / Vol. 11954
Keywords
Malware; Embedding; Classification; Deep learning;
DOI
10.1007/978-3-030-36711-4_14
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The vast volume of data that must be screened for potentially malicious content is becoming one of the major challenges for antivirus products. In this paper, we propose a novel image-based malware classification model using deep learning to cope with large-scale malware analysis. The model comprises a malware embedding method called YongImage, which maps instruction-level information and disassembly metadata generated by the IDA disassembler into an image vector, and a deep neural network named malVecNet, which has a simpler structure and a faster convergence rate. YongImage converts malware analysis tasks into image classification problems, which do not rely on domain knowledge or complex feature extraction. We also borrow the idea of sentence-level classification from Natural Language Processing to build and optimize malVecNet. Compared with previous work, malVecNet has better theoretical interpretability and can be trained more effectively. We evaluate our model using 10-fold cross-validation on the Microsoft malware classification challenge dataset. The results show that our model achieves 99.49% accuracy with a log loss of 0.022. Although our scheme is less precise than the challenge winner's, it delivers an orders-of-magnitude performance boost. Our model also outperforms most other related work.
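The two reported metrics (accuracy and log loss under 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the helper names `multiclass_log_loss`, `accuracy`, and `k_fold_indices` are hypothetical, the toy probabilities are made up for illustration, and the shuffled round-robin fold assignment is an assumption (the abstract does not say how folds were drawn).

```python
import math
import random


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class of each sample."""
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) when a model outputs a hard 0 or 1.
        p = min(max(row[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(y_true)


def accuracy(y_true, probs):
    """Fraction of samples whose highest-probability class equals the true label."""
    hits = 0
    for label, row in zip(y_true, probs):
        predicted = max(range(len(row)), key=row.__getitem__)
        hits += int(predicted == label)
    return hits / len(y_true)


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Assumption: samples are shuffled once, then dealt round-robin into k folds.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Toy example with 3 classes (illustrative values only, not from the paper).
y_true = [0, 1, 2, 1]
probs = [
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.6, 0.3, 0.1],  # misclassified sample
]
toy_accuracy = accuracy(y_true, probs)
toy_log_loss = multiclass_log_loss(y_true, probs)
```

In a full evaluation, each `(train, test)` pair from `k_fold_indices` would be used to fit a model on the training indices and score it on the held-out fold, with the per-fold metrics averaged at the end. Log loss penalizes confident wrong predictions more heavily than accuracy does, which is why the two figures are reported together.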
Pages: 150 - 161
Page count: 12