Evaluating the stealth of reinforcement learning-based cyber attacks against unknown scenarios using knowledge transfer techniques

Cited by: 0

Authors:

Horta Neto, Antonio Jose [1,2,3]

dos Santos, Anderson Fernandes Pereira [2,3]

Goldschmidt, Ronaldo Ribeiro [1,2]

Affiliations:

[1] Mil Inst Engn IME, Def Engn Grad Program, Rio De Janeiro, RJ, Brazil

[2] IME, Syst & Comp Grad Program, Rio De Janeiro, RJ, Brazil

[3] IME, Cyber Secur Cyber Phys Syst Lab, Rio De Janeiro, RJ, Brazil

Source:

JOURNAL OF COMPUTER SECURITY | 2025 / Volume 33 / Issue 02

Keywords:

Reinforcement learning; transfer learning; imitation learning; knowledge transfer; cyber attacks; unknown scenarios;

DOI:

10.3233/JCS-230145

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Organizations are vulnerable to cyber attacks as they rely on computer networks and the internet for communication and data storage. While Reinforcement Learning (RL) is a widely used strategy to simulate and learn from these attacks, RL-guided offensives against unknown scenarios often lead to early exposure due to low stealth resulting from mistakes during the training phase. To address this issue, this work evaluates whether the use of Knowledge Transfer Techniques (KTT), such as Transfer Learning and Imitation Learning, reduces the probability of early exposure by smoothing mistakes during training. This study developed a laboratory platform and a method to compare RL-based cyber attacks using KTT for unknown scenarios. The experiments simulated 2 unknown scenarios using 4 traditional RL algorithms and 4 KTT. Although some algorithms using KTT obtained superior results, the gains were not significant for stealth during the initial epochs of training. Nevertheless, the experiments also revealed that, throughout the entire learning cycle, Trust Region Policy Optimization (TRPO) is a promising algorithm for conducting cyber offensives based on Reinforcement Learning.


Pages: 100-115

Page count: 16
