Static PE Malware Detection Using Gradient Boosting Decision Trees Algorithm

Cited by: 17
Authors
Huu-Danh Pham [1]
Tuan Dinh Le [2]
Thanh Nguyen Vu [1]
Affiliations
[1] Vietnam Natl Univ Ho Chi Minh City, Univ Informat Technol, Ho Chi Minh City, Vietnam
[2] Long An Univ Econ & Ind, Tan An, Long An Provinc, Vietnam
Source
FUTURE DATA AND SECURITY ENGINEERING, FDSE 2018 | 2018 / Vol. 11251
Keywords
Malware detection; Machine learning; PE file format; Gradient boosting decision trees; EMBER dataset;
DOI
10.1007/978-3-030-03192-3_17
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Static malware detection is an essential layer in a security suite: it attempts to classify samples as malicious or benign before execution. However, most related work suffers from scalability issues. For example, methods based on neural networks usually require long training times [13], while others use imbalanced datasets [17, 20], which makes validation metrics misleading in practice. In this study, we apply a static malware detection method based on Portable Executable (PE) analysis and the Gradient Boosting Decision Tree algorithm. We reduce the training time by appropriately reducing the feature dimension. The experimental results show that the proposed method achieves a detection rate of up to 99.394% at a 1% false alarm rate, and a false alarm rate below 0.1% at a detection rate of 97.572%, based on more than 600,000 training and 200,000 testing samples from the Endgame Malware BEnchmark for Research (EMBER) dataset [1].
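The operating points quoted in the abstract pair a detection rate with a false alarm rate. A minimal sketch of how such a pair can be read off a set of classifier scores is given below. It is an illustration only, not the authors' evaluation code: the function name `detection_rate_at_far`, the toy labels and the toy scores are all hypothetical, and it assumes that a higher score means "more likely malicious" and that a sample is flagged when its score is at or above the threshold.

```python
import math


def detection_rate_at_far(labels, scores, max_far):
    """Return (threshold, detection_rate, false_alarm_rate).

    labels: 1 = malicious, 0 = benign (assumed encoding).
    scores: higher means more likely malicious (assumed convention).
    The lowest threshold whose false alarm rate stays within max_far
    is chosen, since lowering the threshold can only raise detection.
    """
    benign = [s for y, s in zip(labels, scores) if y == 0]
    malicious = [s for y, s in zip(labels, scores) if y == 1]
    if not benign or not malicious:
        raise ValueError("need at least one benign and one malicious sample")

    # Try every distinct score as a threshold, from lowest to highest.
    for threshold in sorted(set(scores)):
        far = sum(s >= threshold for s in benign) / len(benign)
        if far <= max_far:
            dr = sum(s >= threshold for s in malicious) / len(malicious)
            return threshold, dr, far

    # No observed score satisfies the constraint: flag nothing.
    return math.inf, 0.0, 0.0


# Toy data, made up for illustration (not taken from EMBER).
demo_labels = [0, 0, 0, 0, 1, 1, 1, 1]
demo_scores = [0.1, 0.2, 0.3, 0.8, 0.4, 0.7, 0.9, 0.95]
print(detection_rate_at_far(demo_labels, demo_scores, 0.25))  # (0.4, 1.0, 0.25)
```

Tightening the false alarm budget forces a higher threshold and therefore a lower detection rate, which is the trade-off behind the two figures reported in the abstract. The linear scan per threshold is kept for readability; a sorted sweep would be preferable on hundreds of thousands of samples.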
Pages: 228-236
Page count: 9
Related papers
21 in total
[21] A combination of negative selection algorithm and Artificial Immune Network for virus detection [J].
Nguyen, Vu Thanh; Nguyen, Toan Tan; Mai, Khang Trong; Le, Tuan Dinh.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014, 8860: 97-106