Self-Supervised Latent Representations of Network Flows and Application to Darknet Traffic Classification

Cited by: 3
Authors
Zakroum, Mehdi [1 ,2 ,3 ,4 ]
Francois, Jerome [2 ]
Ghogho, Mounir [1 ]
Chrisment, Isabelle [3 ,4 ]
Affiliations
[1] Int Univ Rabat, TIC Lab, Rabat 111000, Morocco
[2] Inria, F-54600 Nancy, France
[3] Univ Lorraine, F-54052 Nancy, France
[4] LORIA, F-54506 Nancy, France
Keywords
Self-supervised learning; unsupervised learning; graph neural networks; graph auto-encoders; anonymous walk embedding; graph embedding; network flows; network probing; network telescope; Darknet; botnet
DOI
10.1109/ACCESS.2023.3263206
CLC Classification Number
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
Characterizing network flows is essential for security operators to enhance their awareness of cyber-threats targeting their networks. The automation of network flow characterization with machine learning has received much attention in recent years. To this end, raw network flows need to be transformed into structured and exploitable data. In this work, we propose a method to encode raw network flows into robust latent representations that can be exploited in downstream tasks. First, raw network flows are transformed into graph-structured objects capturing their topological aspects (packet-wise transitional patterns) and features (used protocols, packet flags, etc.). Then, using self-supervised techniques such as Graph Auto-Encoders and Anonymous Walk Embeddings, each network flow graph is encoded into a latent representation that encapsulates both the structure of the graph and the features of its nodes, while minimizing information loss. This results in semantically rich and robust representation vectors that can be used by machine learning algorithms to perform downstream network-related tasks. To evaluate our network flow embedding models, we use probing flows captured with two /20 network telescopes and labeled using reports originating from different sources. The experimental results show that the proposed network flow embedding approach allows for reliable classification of darknet probing activity. Furthermore, a comparison between our self-supervised approach and a fully supervised graph convolutional network shows that, in situations with limited labeled data, the downstream classification model that uses the derived latent representations as inputs outperforms the fully supervised graph convolutional network. This work has many applications in cybersecurity, such as network flow clustering, attack detection and prediction, malware detection, vulnerability exploit analysis, and inference of attackers' intentions.
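To make the embedding pipeline more concrete, below is a minimal, self-contained Python sketch of the Anonymous Walk Embedding idea applied to a toy flow graph. It is an illustrative assumption rather than the authors' implementation: the choice of nodes (packet flag types), edges (observed packet-to-packet transitions), the walk length, and the number of sampled walks are all hypothetical, and the paper's full models additionally rely on Graph Auto-Encoders and node features, which this sketch omits.

```python
# Minimal sketch of Anonymous Walk Embeddings (illustrative only).
# A flow is modeled as a small directed graph; its embedding is the
# empirical distribution over "anonymous" walk patterns sampled from it.
import random
from collections import Counter

def anonymize(walk):
    """Relabel nodes by first appearance, e.g. ['SYN', 'ACK', 'SYN'] -> (0, 1, 0)."""
    first_seen = {}
    labels = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen)
        labels.append(first_seen[node])
    return tuple(labels)

def sample_walk(adjacency, start, length, rng):
    """Sample a random walk over an adjacency dict, stopping early at dead ends."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = adjacency.get(walk[-1])
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk

def anonymous_walk_embedding(adjacency, walk_length=4, num_walks=2000, seed=0):
    """Estimate the distribution of anonymous walk patterns (the flow embedding)."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    counts = Counter(
        anonymize(sample_walk(adjacency, rng.choice(nodes), walk_length, rng))
        for _ in range(num_walks)
    )
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}

if __name__ == "__main__":
    # Hypothetical toy flow graph: nodes are packet flag types and edges are
    # packet-to-packet transitions observed within a single flow.
    flow_graph = {
        "SYN": ["SYN-ACK"],
        "SYN-ACK": ["ACK"],
        "ACK": ["PSH-ACK", "FIN"],
        "PSH-ACK": ["ACK"],
        "FIN": [],
    }
    for pattern, prob in sorted(anonymous_walk_embedding(flow_graph).items(),
                                key=lambda kv: -kv[1]):
        print(pattern, round(prob, 3))
```

Running the example prints the estimated probability of each anonymous walk pattern; in a setting like the paper's, such distribution vectors (or the latent codes produced by a graph auto-encoder) would serve as input features for downstream classifiers or clustering algorithms.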
Pages: 90749-90765
Number of pages: 17