Building Visual Malware Dataset using VirusShare Data and Comparing Machine Learning Baseline Model to CoAtNet for Malware Classification

Cited by: 0
Author
Bruzzese, Roberto R. [1]
Affiliation
[1] Sapienza Univ Rome, Rome, Italy
Source
2024 16TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, ICMLC 2024 | 2024
Keywords
Malware; Machine Learning; Visual Images; CoAtNet
DOI
10.1145/3651671.3651735
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This work is inspired by the CoAtNet work of Zihang Dai, Hanxiao Liu, Quoc V. Le, and Mingxing Tan at Google Research, Brain Team. That work showed that the strengths of convolution and transformer architectures can be combined by unifying convnets and self-attention in a single machine learning model. We want to apply CoAtNet to a visual dataset of malware images and compare its performance with that of a baseline CNN model. This requires a dataset of appropriate size and format, which in turn means finding or generating a visual dataset of malware images suitable for measuring the accuracy of the constructed model. As will be seen, creating a new dataset is preferred to searching for an existing one. Although the visual approach has been tested extensively in recent years, there is still a need for data customised to the model under examination. The work described in this paper can serve as a guide to constructing an optimal visual dataset of malware images that is balanced and appropriately sized.
Pages: 185-193
Page count: 9