Malware Detection Using Deep Learning and Correlation-Based Feature Selection

Cited by: 53
Authors
Alomari, Esraa Saleh [1]
Nuiaa, Riyadh Rahef [1]
Alyasseri, Zaid Abdi Alkareem [2, 3, 4]
Mohammed, Husam Jasim [5]
Sani, Nor Samsiah [6]
Esa, Mohd Isrul [6]
Musawi, Bashaer Abbuod [7]
Affiliations
[1] Wasit Univ, Coll Educ Pure Sci, Al Kut 52001, Iraq
[2] Univ Kufa, Informat Technol Res & Dev Ctr ITRDC, Najaf 54001, Iraq
[3] Univ Warith Al Anbiyaa, Coll Engn, Karbala 63514, Iraq
[4] Univ Tenaga Nas, Natl Energy Ctr, Selangor 43000, Malaysia
[5] Imam Jaafar Al Sadiq Univ, Coll Adm & Financial Sci, Dept Business Adm, Baghdad 10001, Iraq
[6] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Ctr Artificial Intelligence Technol, Bangi 43600, Malaysia
[7] Univ Kufa, Fac Educ Girls, Dept Biol, Najaf 54001, Iraq
Source
SYMMETRY-BASEL | 2023 / Vol. 15 / Issue 01
Keywords
malware detection; deep learning; dense model; feature selection; LSTM;
DOI
10.3390/sym15010123
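Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract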
Malware is one of the most frequent forms of cyberattack, and its prevalence across networks grows daily. Malware traffic is always asymmetrical compared to benign traffic, which is always symmetrical. Fortunately, many artificial intelligence techniques can detect malware and distinguish it from normal activity. However, the problem of handling large, high-dimensional data has not been adequately addressed. This paper introduces a high-performance malware detection system that combines deep learning with feature selection. Two different malware datasets are used to detect malware and differentiate it from benign activity. The datasets are preprocessed, and correlation-based feature selection is then applied to produce several feature-selected versions of each dataset. Dense and LSTM-based deep learning models are trained on these feature-selected versions and evaluated using several performance metrics (accuracy, precision, recall, and F1-score). The results indicate that some feature-selected scenarios preserve almost the same performance as the original dataset. Because the two datasets differ in nature, they show different levels of performance change. For the first dataset, the feature reduction ratios range from 18.18% to 42.42%, with performance degradation of 0.07% to 5.84%, respectively. For the second dataset, the reduction ratios range from 81.77% to 93.5%, with performance degradation of 3.79% to 9.44%, respectively.
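As a rough illustration of the pipeline the abstract describes, the sketch below shows one simple way to filter features by correlation, compute a feature reduction ratio, and derive the four listed metrics. It is not the authors' implementation: the filter used here (keeping columns whose absolute Pearson correlation with the label meets a threshold) is an assumed stand-in for the paper's correlation-based selection, and the function names `select_by_label_correlation`, `reduction_ratio`, and `binary_metrics`, as well as the threshold parameter, are hypothetical.

```python
import numpy as np


def select_by_label_correlation(X, y, threshold):
    """Return indices of columns whose |Pearson r| with the label is >= threshold.

    Assumption: a plain label-correlation filter, used only for illustration.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # center each feature column
    yc = y - y.mean()                # center the label vector
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; treat their correlation as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = np.where(denom > 0, (Xc * yc[:, None]).sum(axis=0) / denom, 0.0)
    return np.flatnonzero(np.abs(r) >= threshold)


def reduction_ratio(n_original, n_kept):
    """Percentage of features removed relative to the original feature count."""
    return 100.0 * (n_original - n_kept) / n_original


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for 0/1 labels (1 = malware)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(((y_true == 1) & (y_pred == 1)).sum())
    fp = int(((y_true == 0) & (y_pred == 1)).sum())
    fn = int(((y_true == 1) & (y_pred == 0)).sum())
    tn = int(((y_true == 0) & (y_pred == 0)).sum())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (made up for this sketch): four samples, four features.
X_demo = np.array([
    [0, 1, 5, 0],
    [0, 1, 5, 1],
    [1, 0, 5, 0],
    [1, 0, 5, 1],
])
y_demo = np.array([0, 0, 1, 1])
kept = select_by_label_correlation(X_demo, y_demo, threshold=0.5)
print("kept columns:", kept)
print("reduction ratio (%):", reduction_ratio(X_demo.shape[1], len(kept)))
```

In a setup like this, the selected columns would be passed to the dense or LSTM model, and the degradation figures quoted in the abstract would correspond to the drop in a metric between the full-feature model and the reduced-feature model.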
Pages: 21