Machine learning for encrypted malicious traffic detection: Approaches, datasets and comparative study

Cited by: 55
Authors
Wang, Zihao [1]
Fok, Kar Wai [1]
Thing, Vrizlynn L. L. [1]
Affiliations
[1] Cybersecur Strateg Technol Centr ST Engn Singapor, Singapore, Singapore
Keywords
encrypted malicious traffic detection; traffic classification; machine learning; deep learning; NEURAL-NETWORKS; CLASSIFICATION; INTERNET;
DOI
10.1016/j.cose.2021.102542
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As people's demand for personal privacy and data security becomes a priority, encrypted traffic has become mainstream in the cyber world. However, traffic encryption also shields malicious and illegal traffic introduced by adversaries from being detected. This is especially so in the post-COVID-19 environment, where malicious traffic encryption is growing rapidly. Common security solutions that rely on plain payload content analysis, such as deep packet inspection, are rendered useless. Thus, machine learning based approaches have become an important direction for encrypted malicious traffic detection. In this paper, we formulate a universal framework of machine learning based encrypted malicious traffic detection techniques and provide a systematic review. Furthermore, current research adopts different datasets to train models due to the lack of well-recognized datasets and feature sets. As a result, model performance cannot be compared and analyzed reliably. Therefore, in this paper, we analyse, process and combine datasets from 5 different sources to generate a comprehensive and fair dataset to aid future research in this field. On this basis, we also implement and compare 10 encrypted malicious traffic detection algorithms. We then discuss challenges and propose future directions of research. (C) 2021 Elsevier Ltd. All rights reserved.
Pages: 22