Building Visual Malware Dataset using VirusShare Data and Comparing Machine Learning Baseline Model to CoAtNet for Malware Classification

Authors
Bruzzese, Roberto R. [1]
Affiliation
[1] Sapienza Univ Rome, Rome, Italy
Source
2024 16TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, ICMLC 2024 | 2024
Keywords
Malware; Machine Learning; Visual Images; CoAtNet
DOI
10.1145/3651671.3651735
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This work takes inspiration from the CoAtNet research of Zihang Dai, Hanxiao Liu, Quoc V. Le, and Mingxing Tan at Google Research, Brain Team. That work showed that the strengths of convolution and transformer architectures can be combined by unifying convnets and self-attention in a single machine learning model. We want to apply CoAtNet to a visual dataset of malware images and compare its performance to that of a baseline CNN model. This requires a dataset of appropriate size and format, and hence the need to find or generate a visual dataset of malware images suitable for measuring the accuracy of the constructed model. As will be seen, creating a new dataset is preferred to searching for an existing one. Although the visual approach has been tested extensively in recent years, there is still a need for data better tailored to the model under examination. The work described in this paper can serve as a guide to building a balanced, appropriately sized visual dataset of malware images.
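The abstract does not say how a binary is turned into an image. As a rough illustration only, the sketch below uses a common convention from the visual malware literature: each byte becomes one 8-bit grayscale pixel, laid out row by row at a fixed width. The function name `bytes_to_grayscale_rows`, the zero-padding of the last row, and the sample bytes are all assumptions made for this example, not details taken from the paper.

```python
from typing import List


def bytes_to_grayscale_rows(data: bytes, width: int) -> List[List[int]]:
    """Map raw bytes to rows of 8-bit grayscale pixel values (0-255).

    Assumption: one byte becomes one pixel, rows are `width` pixels wide,
    and the last row is zero-padded. The paper may use a different layout.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad with zero bytes so the length is a multiple of the row width.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    # Slice the padded byte string into consecutive rows of `width` pixels.
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Hypothetical 6-byte sample standing in for a malware binary.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03, 0xFF])
image = bytes_to_grayscale_rows(sample, width=4)
print(image)  # [[77, 90, 144, 0], [3, 255, 0, 0]]
```

In a real pipeline the rows would typically be written out as an image file and resized to the input shape the classifier expects; those steps are omitted here because the record gives no details about them.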
Pages: 185-193
Number of pages: 9