Malware visualization methods based on deep convolution neural networks

Cited by: 21
Authors
Ren, Zhuojun [1]
Chen, Guang [1]
Lu, Wenke [1]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Malware visualization; Space filling curves; Convolution neural networks; Deep learning; Transfer learning;
DOI
10.1007/s11042-019-08310-9
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we propose two visualization methods for malware analysis based on n-gram features of byte sequences. The space filling curve mapping (SFCM) method uses fractal curves to visualize the one-gram features of byte sequences, i.e., the malware files themselves, and uses different colors to distinguish printable characters from non-printable ones. This method addresses two shortcomings of existing methods: they cannot interactively locate characters, and they cannot avoid the risk of the Decompression Bomb attack caused by large malware files. The Markov dot plot (MDP) method visualizes the bi-gram features of byte sequences and their statistical information as the coordinates and brightness of pixels, respectively. This solves the problem that relocating code sections or adding redundant information helps malware evade detection based on the global image. We applied the two methods to the Microsoft malware samples (BIG 2015, Kaggle); deep convolutional networks learn the visualized results and extract image features, which an SVM (support vector machine) then uses for classification. For malware classification, our methods achieved 98.36% and 99.08% accuracy, respectively. We also visualized benign PE (portable executable) files from the Windows OS and evaluated them together with the above malware set. For malware detection, the two methods achieved 99.21% and 98.74% accuracy, respectively. These results are better than those of the existing grayscale method.
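The two mappings described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the Hilbert curve as the fractal curve, the color scheme (printable bytes in the blue channel, non-printable bytes in the red channel), and row-normalized transition frequencies as the "statistical information" behind pixel brightness. The function names `hilbert_d2xy`, `sfcm_image`, and `mdp_image` are hypothetical.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order x 2**order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so consecutive indices stay adjacent.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sfcm_image(data: bytes):
    """SFCM-style sketch: place byte i at the i-th point of a Hilbert curve.

    Assumed color scheme (not from the paper): printable ASCII goes to the
    blue channel, everything else to the red channel; padding stays black.
    """
    order = 0
    while (1 << (2 * order)) < max(len(data), 1):
        order += 1  # smallest grid with at least len(data) cells
    n = 1 << order
    img = [[(0, 0, 0)] * n for _ in range(n)]
    for i, b in enumerate(data):
        x, y = hilbert_d2xy(order, i)
        printable = 0x20 <= b <= 0x7E
        img[y][x] = (0, 0, b) if printable else (b, 0, 0)
    return img


def mdp_image(data: bytes):
    """MDP-style sketch: 256 x 256 image indexed by (byte, next byte).

    Assumed brightness (not from the paper): the row-normalized bi-gram
    frequency, i.e. an estimate of P(next byte | byte), scaled to 0..255.
    """
    counts = [[0] * 256 for _ in range(256)]
    for a, b in zip(data, data[1:]):
        counts[a][b] += 1
    img = [[0] * 256 for _ in range(256)]
    for a in range(256):
        total = sum(counts[a])
        if total:
            img[a] = [round(255 * c / total) for c in counts[a]]
    return img
```

Because `mdp_image` depends only on which byte pairs occur and how often, moving a block of bytes elsewhere in the file changes the image only through the few pairs at the block boundaries. This is one way to read the abstract's claim about robustness to relocated code sections. The `sfcm_image` output, in contrast, keeps file order, so a pixel can be traced back to a byte offset by inverting the curve mapping.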
Pages: 10975-10993
Page count: 19