An Effective Malware Detection Method Using Hybrid Feature Selection and Machine Learning Algorithms

Cited by: 0
Authors
Namita Dabas
Prachi Ahlawat
Prabha Sharma
Affiliation
[1] The NorthCap University, School of Engineering and Technology
Source
Arabian Journal for Science and Engineering | 2023 / Vol. 48
Keywords
Malware detection; API calls; API sequences; Frequent patterns; Feature selection; Machine learning;
DOI
Not available
Abstract
With the advent of internet-based technology, the number of internet-enabled devices has surged. These devices generate massive volumes of meaningful information to accomplish a variety of tasks, and cyber-criminals exploit this information to carry out cyber-attacks. Malware is one of the most prevalent attacks in the cyber threat landscape and a common means of fulfilling the malicious intents of cyber-criminals. It is therefore imperative to detect and prevent malware attacks accurately to minimize the damage. Several researchers have shown that API calls capture malware behaviour accurately and can be used with machine learning algorithms to detect malware effectively. This paper therefore proposes a novel malware detection method for the Windows platform based on API calls, feature selection, and machine learning algorithms. It extracts API call information in three forms (API call usage, API call frequency, and API call sequences) to create three feature sets. These feature sets are enriched using the TF-IDF technique and combined into a more extensive and robust feature set, the API integrated feature set. A series of experiments showed that the API integrated feature set outperformed the other feature sets, attaining an accuracy of 99.6% or higher for all machine learning algorithms. To address the high dimensionality of the API integrated feature set, this work applied several feature selection techniques; the results showed that hybrid feature selection combined with machine learning algorithms achieves 99.6–99.9% accuracy with only 9% of the features of the API integrated feature set.
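The feature construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy traces, the use of bigrams for API call sequences, the smoothed IDF formula, and the helper names `ngrams`, `tfidf`, and `integrate` are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math
from collections import Counter

# Hypothetical toy traces: each sample is an ordered list of API call names.
traces = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "ReadFile", "ReadFile", "CloseHandle"],
    ["RegOpenKeyExW", "RegSetValueExW", "CreateFileW", "WriteFile"],
]

def ngrams(trace, n=2):
    """Turn an ordered trace into overlapping n-grams (assumed sequence form)."""
    return [" > ".join(trace[i:i + n]) for i in range(len(trace) - n + 1)]

def tfidf(docs, binary=False):
    """Weight tokens by tf * idf, with idf = ln((1 + N) / (1 + df)) + 1.

    With binary=True the term frequency is 1 for any token that occurs,
    which is how this sketch models the 'usage' feature set.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        row = []
        for tok in vocab:
            tf = min(counts[tok], 1) if binary else counts[tok]
            row.append(tf * idf[tok])
        rows.append(row)
    return vocab, rows

def integrate(traces, n=2):
    """Concatenate usage, frequency, and sequence features per sample."""
    u_vocab, u_rows = tfidf(traces, binary=True)             # API call usage
    f_vocab, f_rows = tfidf(traces)                           # API call frequency
    s_vocab, s_rows = tfidf([ngrams(t, n) for t in traces])   # API call sequences
    names = (
        [f"use:{t}" for t in u_vocab]
        + [f"freq:{t}" for t in f_vocab]
        + [f"seq:{t}" for t in s_vocab]
    )
    rows = [u + f + s for u, f, s in zip(u_rows, f_rows, s_rows)]
    return names, rows

names, rows = integrate(traces)
print(len(names), "integrated features for", len(rows), "samples")
```

Concatenating the three blocks is what makes the integrated set high-dimensional, which is the concern the abstract's feature selection step addresses. Feature selection and classification are left out of the sketch because the abstract does not name the specific techniques used.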
Pages: 9749-9767
Number of pages: 18