FlowTransformer: A transformer framework for flow-based network intrusion detection systems

Cited by: 44
Authors
Manocchio, Liam Daly [1 ]
Layeghy, Siamak [1 ]
Lo, Wai Weng [1 ]
Kulatilleke, Gayan K. [1 ]
Sarhan, Mohanad [1 ]
Portmann, Marius [1 ]
Affiliations
[1] Univ Queensland, Sch ITEE, Brisbane, Australia
Keywords
Transformers; Network intrusion detection system (NIDS); Machine learning (ML); Generative pre-trained transformer (GPT); Network flow;
DOI
10.1016/j.eswa.2023.122564
CLC classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents the FlowTransformer framework, a novel approach for implementing transformer-based Network Intrusion Detection Systems (NIDSs). FlowTransformer leverages the strengths of transformer models in identifying the long-term behaviour and characteristics of networks, which are often overlooked by most existing NIDSs. By capturing these complex patterns in network traffic, FlowTransformer offers a flexible and efficient tool for researchers and practitioners in the cybersecurity community who are seeking to implement NIDSs using transformer-based models. FlowTransformer allows the direct substitution of various transformer components, including the input encoding, the transformer itself, and the classification head, and supports their evaluation across any flow-based network dataset. To demonstrate the effectiveness and efficiency of the FlowTransformer framework, we utilise it to provide an extensive evaluation of common transformer architectures, such as GPT 2.0 and BERT, on three widely used public NIDS benchmark datasets. We provide results for accuracy, model size and speed. A key finding of our evaluation is that the choice of classification head has the most significant impact on model performance. Surprisingly, Global Average Pooling, which is commonly used in text classification, performs very poorly in the context of NIDS. In addition, we show that model size can be reduced by over 50%, and inference and training times improved, with no loss of accuracy, by making specific choices of input encoding and classification head instead of other commonly used alternatives.
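The abstract describes a modular pipeline in which the input encoding, the transformer, and the classification head can each be swapped independently. The following is a minimal, hypothetical sketch of such a pipeline in PyTorch; the class names (FlowEncoding, LastTokenHead, GlobalAveragePoolingHead, FlowSequenceClassifier) and all parameter values are illustrative assumptions and do not reflect the actual FlowTransformer API.

import torch
import torch.nn as nn

# Hypothetical component names for illustration only.
class FlowEncoding(nn.Module):
    """Input encoding: projects each flow's feature vector into the model dimension."""
    def __init__(self, n_features: int, d_model: int):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
    def forward(self, flows):                  # (batch, window, n_features)
        return self.proj(flows)                # (batch, window, d_model)

class LastTokenHead(nn.Module):
    """Classification head: classify from the representation of the most recent flow."""
    def __init__(self, d_model: int, n_classes: int):
        super().__init__()
        self.fc = nn.Linear(d_model, n_classes)
    def forward(self, h):                      # (batch, window, d_model)
        return self.fc(h[:, -1, :])

class GlobalAveragePoolingHead(nn.Module):
    """Alternative head: average over all flows in the window (the type of
    head the paper reports performing poorly for NIDS)."""
    def __init__(self, d_model: int, n_classes: int):
        super().__init__()
        self.fc = nn.Linear(d_model, n_classes)
    def forward(self, h):
        return self.fc(h.mean(dim=1))

class FlowSequenceClassifier(nn.Module):
    """Modular pipeline: input encoding -> transformer -> classification head."""
    def __init__(self, encoding, transformer, head):
        super().__init__()
        self.encoding = encoding
        self.transformer = transformer
        self.head = head
    def forward(self, flows):
        return self.head(self.transformer(self.encoding(flows)))

# Example assembly: an encoder-only transformer over windows of 8 flows,
# each described by 40 numeric features, classified as benign vs. attack.
d_model = 64
model = FlowSequenceClassifier(
    encoding=FlowEncoding(n_features=40, d_model=d_model),
    transformer=nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                   dim_feedforward=128, batch_first=True),
        num_layers=2),
    head=LastTokenHead(d_model=d_model, n_classes=2),
)
logits = model(torch.randn(32, 8, 40))         # (batch=32, window=8, features=40)

Replacing head=LastTokenHead(...) with head=GlobalAveragePoolingHead(...) changes only the final aggregation step; this is the kind of component substitution whose impact on accuracy, model size and speed the paper evaluates.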
Pages: 15
References
45 items in total
[1] Aitken P., 2013, RFC 7011, IETF.
[2] Delgadillo K., 1996, Cisco whitepaper.
[3] Devlin J., 2019, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019), Vol. 1, p. 4171.
[4] Han X., Cui S., Liu S., Zhang C., Jiang B., Lu Z. Network intrusion detection based on n-gram frequency and time-aware transformer. Computers & Security, 2023, 128.
[5] Hassan M. M., Gumaei A., Alsanad A., Alrubaian M., Fortino G. A hybrid deep learning model for efficient intrusion detection in big data environment. Information Sciences, 2020, 513: 386-396.
[6] Hindy H., 2020, arXiv:2006.15340.
[7] Huang X., 2020, arXiv:2012.06678.
[8] Kingma D. P., 2014, Advances in Neural Information Processing Systems, Vol. 27.
[9] Ko B. Y. S., 2022, arXiv:2212.04114, DOI: 10.48550/arXiv.2212.04114.
[10] Kumar S., Gupta S., Arora S. Research trends in network-based intrusion detection systems: A review. IEEE Access, 2021, 9: 157761-157779.