Sequence-based malware detection using a single-bidirectional graph embedding and multi-task learning framework

Times cited: 0
Authors
Luo, Jiale [1]
Zhang, Zhewngyu [1]
Luo, Jiesi [2]
Yang, Pin [1]
Jing, Runyu [1]
Affiliations
[1] Sichuan Univ, Sch Cyber Sci & Engn, Chengdu, Sichuan, Peoples R China
[2] Southwest Med Univ, Basic Med Coll, Luzhou, Sichuan, Peoples R China
Keywords
Graph embedding; long short-term memory; malware detection; multi-task learning; dynamic analysis; classification
DOI
10.3233/JCS-230041
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As an important part of malware detection and classification, sequence-based analysis can be integrated into dynamic detection systems for real-time detection. This work presents a novel learning method for malware detection models that leverages advances in graph embedding to fuse n-gram data into a one-hot feature space with different transmission directions. By capturing the information flow, our method finds a better feature representation for detection tasks that rely solely on sequence information. To enhance the stability of the feature representation, this work adopts a multi-task learning strategy, which achieves better performance in independent testing. We evaluate our method on two different real-world datasets and compare it against four strong malware detection models. We also discuss in depth the effects of feature length, graph embedding direction, model depth, and different multi-task learning strategies. The experimental results and discussion show that our method significantly outperforms the alternative approaches across evaluation settings.
Pages: 141-163
Page count: 23