Self-Supervised Latent Representations of Network Flows and Application to Darknet Traffic Classification

Cited by: 3
Authors
Zakroum, Mehdi [1 ,2 ,3 ,4 ]
Francois, Jerome [2 ]
Ghogho, Mounir [1 ]
Chrisment, Isabelle [3 ,4 ]
Affiliations
[1] Int Univ Rabat, TIC Lab, Rabat 111000, Morocco
[2] Inria, F-54600 Nancy, France
[3] Univ Lorraine, F-54052 Nancy, France
[4] LORIA, F-54506 Nancy, France
Keywords
Self-supervised learning; unsupervised learning; graph neural networks; graph auto-encoders; anonymous walk embedding; graph embedding; network flows; network probing; network telescope; darknet; botnet
DOI
10.1109/ACCESS.2023.3263206
CLC number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Characterizing network flows is essential for security operators to enhance their awareness about cyber-threats targeting their networks. The automation of network flow characterization with machine learning has received much attention in recent years. To this end, raw network flows need to be transformed into structured and exploitable data. In this research work, we propose a method to encode raw network flows into robust latent representations exploitable in downstream tasks. First, raw network flows are transformed into graph-structured objects capturing their topological aspects (packet-wise transitional patterns) and features (used protocols, packets' flags, etc.). Then, using self-supervised techniques such as Graph Auto-Encoders and Anonymous Walk Embeddings, each network flow graph is encoded into a latent representation that encapsulates both the structure of the graph and the features of its nodes, while minimizing information loss. This results in semantically rich and robust representation vectors which can be manipulated by machine learning algorithms to perform downstream network-related tasks. To evaluate our network flow embedding models, we use probing flows captured with two /20 network telescopes and labeled using reports originating from different sources. The experimental results show that the proposed network flow embedding approach allows for reliable darknet probing activity classification. Furthermore, a comparison between our self-supervised approach and a fully-supervised graph convolutional network shows that, in situations with limited labeled data, the downstream classification model that uses the derived latent representations as inputs outperforms the fully-supervised graph convolutional network. There are many applications of this research work in cybersecurity, such as network flow clustering, attack detection and prediction, malware detection, vulnerability exploit analysis, and inference of attackers' intentions.
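The abstract names Anonymous Walk Embeddings as one of the self-supervised encoders applied to the flow graphs. Below is a minimal illustrative sketch of that idea on a toy graph: walks are mapped to anonymous patterns (each node replaced by the index of its first occurrence) and their empirical distribution serves as a structure-only embedding. The node names, walk length, and exhaustive enumeration are illustrative assumptions, not the paper's actual settings (which would sample walks and combine this with node features).

```python
from collections import Counter

def anonymous_pattern(walk):
    # Replace each node by the index of its first occurrence in the walk,
    # e.g. ("a", "b", "a") -> (0, 1, 0).
    seen = {}
    return tuple(seen.setdefault(node, len(seen)) for node in walk)

def walks(graph, length):
    # Exhaustively enumerate all walks with `length` edges in a directed
    # graph given as {node: [successors]}; feasible only for tiny graphs
    # (a practical implementation would sample random walks instead).
    paths = [(n,) for n in graph]
    for _ in range(length):
        paths = [p + (s,) for p in paths for s in graph.get(p[-1], [])]
    return paths

def awe(graph, length):
    # Empirical distribution over anonymous walk patterns of a fixed
    # length: a vector representation of the graph's structure alone.
    counts = Counter(anonymous_pattern(w) for w in walks(graph, length))
    total = sum(counts.values())
    return {pattern: c / total for pattern, c in counts.items()}

# Hypothetical "flow graph": nodes stand for packet-wise transitional
# states (illustrative only, not the paper's graph construction).
g = {"SYN": ["SYNACK"], "SYNACK": ["ACK", "SYN"], "ACK": []}
emb = awe(g, length=2)
# emb == {(0, 1, 2): 1/3, (0, 1, 0): 2/3}
```

Two graphs with similar transitional structure yield similar pattern distributions, which is what makes such vectors usable as inputs to a downstream classifier even without labels.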
Pages: 90749-90765
Page count: 17