Dynamic Malware Detection Using Parameter-Augmented Semantic Chain

Cited by: 1
Authors
Zhao, Donghui [1]
Wang, Huadong [2]
Kou, Liang [1]
Li, Zhannan [1]
Zhang, Jilin [1]
Affiliations
[1] Hangzhou Dianzi Univ, Coll Cyberspace, Hangzhou 310018, Peoples R China
[2] DBAPPSecurity Co Ltd, Hangzhou 310051, Peoples R China
Keywords
privacy protection; malware detection; deep learning; feature hashing; API sequences
DOI
10.3390/electronics12244992
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Driven by the rapid development and widespread presence of malware, deep-learning-based malware detection has become a pivotal approach for protecting private data. Behavior-based malware detection is effective, but changes in the running environment and malware evolution can alter the API calls used for detection. Most existing methods either ignore API call parameters or analyze them separately from the calls, which loses important semantic information; considering API call parameters and their combinations can therefore improve behavior-based malware detection. To this end, this paper proposes a novel API feature engineering method that employs parameter-augmented semantic chains to improve the system's resilience to unknown parameters and raise the detection rate. The method semantically decomposes each API to derive a behavior semantic chain, which provides an initial representation of the behavior exhibited by a sample. To make the chain depict that behavior more accurately, the method integrates the parameters used by the API into the semantic chain. An information compression technique is then applied to minimize the loss of critical actions when API sequences are truncated. Finally, a deep learning model consisting of a gated CNN, a Bi-LSTM, and an attention mechanism extracts the semantic features embedded in the API sequences and improves overall detection accuracy. We evaluate the proposed method on the Datacon2019 competition dataset. Experiments indicate that it outperforms vocabulary-based baselines in both robustness to unknown parameters and detection rate.
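Illustrative sketch (not the authors' implementation): the abstract does not specify how APIs are decomposed, how parameters are encoded, or how sequences are compressed. The Python below shows one plausible reading under stated assumptions: API names are split on CamelCase boundaries, each parameter value is mapped to a fixed-size hash bucket (in the spirit of the "feature hashing" keyword) so that unseen values still receive a token, and consecutive duplicate calls are collapsed as a simple stand-in for the information compression step. All function names, the bucket count of 1024, and the `key#bucket` token format are hypothetical.

```python
import hashlib
import re
from itertools import groupby


def split_api_name(api_name):
    """Split a CamelCase API name into lowercase word tokens (assumed decomposition)."""
    pattern = r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+"
    return [tok.lower() for tok in re.findall(pattern, api_name)]


def hash_bucket(value, n_buckets=1024):
    """Map an arbitrary string to a stable bucket index in [0, n_buckets)."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def semantic_chain(api_name, params, n_buckets=1024):
    """Build a token chain: API word tokens followed by hashed parameter tokens."""
    chain = split_api_name(api_name)
    for key, val in sorted(params.items()):
        # Hypothetical token format: "<parameter name>#<hash bucket of value>".
        chain.append(f"{key.lower()}#{hash_bucket(str(val), n_buckets)}")
    return chain


def compress_repeats(chains):
    """Collapse runs of identical consecutive chains into a single chain."""
    return [chain for chain, _ in groupby(chains)]


# Made-up API trace for illustration only.
calls = [
    ("NtCreateFile", {"FileName": "C:\\Temp\\a.tmp"}),
    ("NtWriteFile", {"Length": 4096}),
    ("NtWriteFile", {"Length": 4096}),
]
chains = compress_repeats([semantic_chain(name, p) for name, p in calls])
for chain in chains:
    print(chain)
```

Because every parameter value falls into one of a fixed number of buckets, a value never seen during training still yields a valid token, which is one way a hashing-based encoding could stay robust where a fixed vocabulary would emit an unknown-token placeholder. The resulting token chains would then be fed to a sequence model such as the gated CNN, Bi-LSTM, and attention stack named in the abstract.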
Pages: 14