Self-Supervised Latent Representations of Network Flows and Application to Darknet Traffic Classification

Cited by: 3
Authors
Zakroum, Mehdi [1, 2, 3, 4]
Francois, Jerome [2]
Ghogho, Mounir [1]
Chrisment, Isabelle [3, 4]
Affiliations
[1] Int Univ Rabat, TIC Lab, Rabat 111000, Morocco
[2] Inria, F-54600 Nancy, France
[3] Univ Lorraine, F-54052 Nancy, France
[4] LORIA, F-54506 Nancy, France
Keywords
Self-supervised learning; unsupervised learning; graph neural networks; graph auto-encoders; anonymous walk embedding; graph embedding; network flows; network probing; network telescope; darknet; botnet
DOI
10.1109/ACCESS.2023.3263206
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Characterizing network flows is essential for security operators to enhance their awareness about cyber-threats targeting their networks. The automation of network flow characterization with machine learning has received much attention in recent years. To this aim, raw network flows need to be transformed into structured and exploitable data. In this research work, we propose a method to encode raw network flows into robust latent representations exploitable in downstream tasks. First, raw network flows are transformed into graph-structured objects capturing their topological aspects (packet-wise transitional patterns) and features (used protocols, packets' flags, etc.). Then, using self-supervised techniques like Graph Auto-Encoders and Anonymous Walk Embeddings, each network flow graph is encoded into a latent representation that encapsulates both the structure of the graph and the features of its nodes, while minimizing information loss. This results in semantically-rich and robust representation vectors which can be manipulated by machine learning algorithms to perform downstream network-related tasks. To evaluate our network flow embedding models, we use probing flows captured with two /20 network telescopes and labeled using reports originating from different sources. The experimental results show that the proposed network flow embedding approach allows for reliable darknet probing activity classification. Furthermore, a comparison between our self-supervised approach and a fully-supervised graph convolutional network shows that, in situations with limited labeled data, the downstream classification model that uses the derived latent representations as inputs outperforms the fully-supervised graph convolutional network. There are many applications of this research work in cybersecurity, such as network flow clustering, attack detection and prediction, malware detection, vulnerability exploit analysis, and inference of attacker's intentions.
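The abstract describes turning each flow into a graph of packet-wise transitions and embedding it with anonymous walks. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes each packet is reduced to a hypothetical string label (for example protocol plus flags), builds a directed transition graph from consecutive packets, and summarizes the graph as an empirical distribution over anonymous walks. The function names, the label format, and the sampling parameters are all illustrative assumptions. The Graph Auto-Encoder part of the method is not covered here.

```python
import random
from collections import Counter, defaultdict


def build_flow_graph(packets):
    """Build a directed transition graph from a sequence of packet labels.

    Assumption: each packet is already reduced to a hashable label
    (e.g. "TCP/SYN"); the record does not specify the exact node definition.
    """
    graph = defaultdict(set)
    for src, dst in zip(packets, packets[1:]):
        graph[src].add(dst)
    # Make sure labels with no outgoing transition still appear as nodes.
    for label in packets:
        graph.setdefault(label, set())
    return dict(graph)


def anonymous_walk(walk):
    """Replace each node by the index of its first occurrence in the walk.

    Indices start at 0 here; some presentations start at 1.
    """
    first_seen = {}
    return tuple(first_seen.setdefault(node, len(first_seen)) for node in walk)


def sample_anonymous_walks(graph, length, n_walks, seed=0):
    """Estimate the distribution of anonymous walks by random sampling.

    Walks stop early at nodes without outgoing edges, so some sampled
    walks may be shorter than `length`.
    """
    rng = random.Random(seed)
    nodes = sorted(graph)
    counts = Counter()
    for _ in range(n_walks):
        node = rng.choice(nodes)
        walk = [node]
        while len(walk) < length and graph[node]:
            node = rng.choice(sorted(graph[node]))
            walk.append(node)
        counts[anonymous_walk(walk)] += 1
    total = sum(counts.values())
    return {aw: c / total for aw, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical probing flow: repeated SYN packets followed by a reset.
    flow = ["TCP/SYN", "TCP/SYN", "TCP/RST", "TCP/SYN"]
    g = build_flow_graph(flow)
    print(sample_anonymous_walks(g, length=3, n_walks=200))
```

Because anonymous walks discard node identities, two flows with different labels but the same transition structure yield the same distribution; node features would have to be brought in separately, which is the role the abstract assigns to the Graph Auto-Encoder.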
Pages: 90749 - 90765
Number of pages: 17
Related papers
50 in total
  • [11] Retinal Image Classification by Self-Supervised Fuzzy Clustering Network
    Luo, Yueguo
    Pan, Jing
    Fan, Shaoshuah
    Du, Zeyu
    Zhang, Guanghua
    IEEE ACCESS, 2020, 8 : 92352 - 92362
  • [12] Self-supervised Network Evolution for Few-shot Classification
    Tang, Xuwen
    Teng, Zhu
    Zhang, Baopeng
    Fan, Jianping
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3045 - 3051
  • [13] Anomaly classification based on self-supervised learning and its application
    Han, Yongsheng
    Qi, Zhiquan
    Tian, Yingjie
    JOURNAL OF RADIATION RESEARCH AND APPLIED SCIENCES, 2024, 17 (03)
  • [14] Self-Supervised Enhancement of Latent Discovery in GANs
    Kappiyath, Adarsh
    Sreelatha, Silpa Vadakkeeveetil
    Sumitra, S.
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 7078 - 7086
  • [15] Self-Supervised Learning for Specified Latent Representation
    Liu, Chicheng
    Song, Libin
    Zhang, Jiwen
    Chen, Ken
    Xu, Jing
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2020, 28 (01) : 47 - 59
  • [16] WATERMARKING IMAGES IN SELF-SUPERVISED LATENT SPACES
    Fernandez, Pierre
    Sablayrolles, Alexandre
    Furon, Teddy
    Jegou, Herve
    Douze, Matthijs
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3054 - 3058
  • [17] Self-Supervised Assisted Semi-Supervised Residual Network for Hyperspectral Image Classification
    Song, Liangliang
    Feng, Zhixi
    Yang, Shuyuan
    Zhang, Xinyu
    Jiao, Licheng
    REMOTE SENSING, 2022, 14 (13)
  • [18] Self-Supervised Traffic Classification: Flow Embedding and Few-Shot Solutions
    Horowicz, Eyal
    Shapira, Tal
    Shavitt, Yuval
    IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2024, 21 (03) : 3054 - 3067
  • [19] Tabular-based self-supervised learning approach for encrypted traffic classification
    Zheng, Xuan
    Ma, Xiuli
    Jin, Yanliang
    Gu, Dongsheng
    Wang, Rui
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (04)
  • [20] Self-supervised Phonotactic Representations for Language Identification
    Ramesh, G.
    Kumar, C. Shiva
    Murty, K. Sri Rama
    INTERSPEECH 2021, 2021, : 1514 - 1518