Memory-efficient detection of large-scale obfuscated malware

Cited by: 0
Authors
Wang Y. [1]
Zhang M. [1]
Affiliations
[1] College of Computer Science and Technology, Jilin University, Jilin, Changchun
Keywords
algorithm; malware; Naïve Bayes
DOI
10.1504/IJWMC.2024.136586
Abstract
Malicious programs frequently use obfuscation techniques to evade detection, yet current effective detection methods often require a large amount of memory during training. This paper proposes a machine-learning-based approach to malware detection that consumes less memory. We use hashing and sparse matrices to build a bag-of-words representation of the text, which reduces memory usage during training. Experiments show that, when training obfuscation recognition on 110,000 text samples, our approach reduces the memory footprint by 95% compared with the existing model. In the de-obfuscation step, our method improves the recognition accuracy of import table functions by 40%. Overall, the model achieves low memory usage during obfuscation-recognition training and improves import table recognition accuracy, while its obfuscation recognition accuracy is only about 10% lower than that of the model before the improvement. Copyright © 2024 Inderscience Enterprises Ltd.
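The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of a hashed bag of words stored sparsely. The bucket count `N_BUCKETS`, the whitespace tokenizer, the CRC32 hash, and the helper names `hashed_bow` and `to_csr` are all assumptions made for illustration, not the authors' method.

```python
import zlib
from collections import Counter

# Hypothetical fixed feature width; the paper's actual value is not given.
N_BUCKETS = 2 ** 18


def hashed_bow(text, n_buckets=N_BUCKETS):
    """Map a text to a sparse {bucket_index: count} bag of words.

    No vocabulary is stored: each token is hashed straight to a column,
    so memory does not grow with the number of distinct tokens.
    """
    counts = Counter()
    for token in text.lower().split():
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        counts[zlib.crc32(token.encode("utf-8")) % n_buckets] += 1
    return dict(counts)


def to_csr(rows):
    """Pack sparse rows into CSR-style lists (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col in sorted(row):
            indices.append(col)
            data.append(row[col])
        indptr.append(len(indices))
    return data, indices, indptr


# Illustrative inputs only; not data from the paper.
docs = ["push ebp mov ebp esp", "xor eax eax push ebp"]
data, indices, indptr = to_csr([hashed_bow(d) for d in docs])
```

Only the non-zero counts are kept, so storage scales with the number of tokens actually present in each text rather than with the full feature width. Distinct tokens can collide in the same bucket; that is the usual trade-off of feature hashing.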
Pages: 48-60
Page count: 12