Improving malware detection using multi-view ensemble learning

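Cited by: 33
Authors
Bai, Jinrong [1]
Wang, Junfeng [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
malware detection; multi-view feature; ensemble learning; fusion
DOI
10.1002/sec.1600
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract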
A huge volume of new malware is created every day, and these samples have not previously been seen in the wild. Current anti-virus software uses byte signatures to identify known malware and has little hope of identifying new malware. Researchers have proposed several malware detection methods based on byte n-grams, opcode n-grams, and format information, and those methods partially capture the information that distinguishes benign from malicious programs. In this study, we design two schemes that incorporate the aforementioned three single-view features and fully exploit their complementary information to discover the true nature of a program. Two datasets are used to evaluate the new-malware detection performance and the generalization performance of the proposed schemes. Experimental results indicate that the proposed schemes increase the detection rate of new malware, improve the generalization performance of the learning model, and reduce the false alarm rate to 0%. Because it is hard for malware to disguise itself in every feature view, the proposed schemes are more robust and harder to deceive. Copyright (C) 2016 John Wiley & Sons, Ltd.
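The abstract does not say how the two fusion schemes work, so the following is only an illustrative sketch, not the authors' method. It assumes two common ways of combining views: feature-level fusion (concatenating per-view feature vectors) and decision-level fusion (averaging per-view scores). All function names, the view names, and the 0.5 threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in a raw byte string (one possible 'view')."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def fuse_by_concatenation(views: Dict[str, List[float]]) -> List[float]:
    """Feature-level fusion (assumed scheme): join per-view vectors.

    Views are concatenated in sorted-name order so the layout is deterministic.
    """
    fused: List[float] = []
    for name in sorted(views):
        fused.extend(views[name])
    return fused


def fuse_by_average(view_scores: Dict[str, float], threshold: float = 0.5) -> bool:
    """Decision-level fusion (assumed scheme): average per-view scores.

    Each score is a hypothetical per-view classifier output in [0, 1], where
    higher means more likely malicious. Returns True if flagged as malware.
    """
    if not view_scores:
        raise ValueError("at least one view score is required")
    mean = sum(view_scores.values()) / len(view_scores)
    return mean >= threshold


if __name__ == "__main__":
    print(byte_ngrams(b"\x00\x01\x00\x01", 2))
    print(fuse_by_concatenation({"opcode": [1.0], "byte": [0.0, 2.0]}))
    print(fuse_by_average({"byte": 0.9, "opcode": 0.2, "format": 0.7}))
```

In the averaging variant, a sample that looks benign in one view (here the hypothetical opcode score of 0.2) can still be flagged when the other views disagree, which is one way to read the abstract's claim that malware struggles to disguise itself in every view.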
Pages: 4227-4241
Page count: 15