Identifying Encrypted Malware Traffic with Contextual Flow Data

Cited by: 142
Authors
Anderson, Blake [1]
McGrew, David [1]
Affiliations
[1] Cisco, San Jose, CA 95134 USA
Source
AISEC'16: PROCEEDINGS OF THE 2016 ACM WORKSHOP ON ARTIFICIAL INTELLIGENCE AND SECURITY | 2016
Keywords
Encryption; Malware; Machine Learning; Transport Layer Security; Network Monitoring
DOI
10.1145/2996758.2996768
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Identifying threats contained within encrypted network traffic poses a unique set of challenges. It is important to monitor this traffic for threats and malware, but do so in a way that maintains the integrity of the encryption. Because pattern matching cannot operate on encrypted data, previous approaches have leveraged observable metadata gathered from the flow, e.g., the flow's packet lengths and inter-arrival times. In this work, we extend the current state-of-the-art by considering a data omnia approach. To this end, we develop supervised machine learning models that take advantage of a unique and diverse set of network flow data features. These data features include TLS handshake metadata, DNS contextual flows linked to the encrypted flow, and the HTTP headers of HTTP contextual flows from the same source IP address within a 5-minute window. We begin by exhibiting the differences between malicious and benign traffic's use of TLS, DNS, and HTTP on millions of unique flows. This study is used to design the feature sets that have the most discriminatory power. We then show that incorporating this contextual information into a supervised learning system significantly increases performance at a 0.00% false discovery rate for the problem of classifying encrypted, malicious flows. We further validate our false positive rate on an independent, real-world dataset.
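As an illustration of the contextual-flow idea in the abstract, the following is a minimal sketch of selecting HTTP flows from the same source IP address within a 5-minute window of a TLS flow. The `Flow` record, its field names, the `http_context` helper, and the choice of a symmetric window around the TLS flow's start time are all assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 5 * 60  # the 5-minute window named in the abstract


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; field names are illustrative only."""
    src_ip: str
    start: float    # flow start time in seconds
    protocol: str   # e.g. "tls" or "http"


def http_context(tls_flow, candidates, window=WINDOW_SECONDS):
    """Return HTTP flows sharing the TLS flow's source IP within `window` seconds.

    Assumption: the window is applied symmetrically to flow start times.
    """
    return [
        f for f in candidates
        if f.protocol == "http"
        and f.src_ip == tls_flow.src_ip
        and abs(f.start - tls_flow.start) <= window
    ]


tls = Flow("10.0.0.5", 1000.0, "tls")
others = [
    Flow("10.0.0.5", 900.0, "http"),   # same host, 100 s earlier: kept
    Flow("10.0.0.5", 1200.0, "http"),  # same host, 200 s later: kept
    Flow("10.0.0.5", 1400.0, "http"),  # same host, 400 s later: outside window
    Flow("10.0.0.9", 1000.0, "http"),  # different host: dropped
]
linked = http_context(tls, others)
```

Headers from the flows in `linked` could then be turned into features alongside the TLS handshake metadata; how that is done is left out here because the abstract does not describe it.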
Pages: 35-46
Page count: 12