Automating post-exploitation with deep reinforcement learning

Cited by: 36
Authors
Maeda, Ryusei [1]
Mimura, Mamoru [1]
Affiliations
[1] 1-10-20 Hashirimizu, Yokosuka, Kanagawa, Japan
Keywords
Reinforcement learning; Post-exploitation; A2C; Q-Learning; SARSA; Deep reinforcement learning; Lateral movement
DOI
10.1016/j.cose.2020.102108
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To assess the risk to information systems, it is important to investigate attacker behavior after successful exploitation (post-exploitation). However, such audits require experts, and to the best of our knowledge, there are no solutions that automate this process. This paper proposes a method for automating post-exploitation by combining deep reinforcement learning with PowerShell Empire, a well-known post-exploitation framework. Our reinforcement learning agents select one of the PowerShell Empire modules as an action. The state of an agent is defined by 10 parameters, such as the type of account the agent has compromised. In the learning phase, we compared the learning progress of three reinforcement learning models: A2C, Q-Learning, and SARSA. The results show that A2C gained reward most efficiently. Moreover, the behavior of the trained agents was evaluated in a test domain network. The results show that the trained agent using A2C could obtain administrative privileges on the domain controller. © 2020 The Authors. Published by Elsevier Ltd.
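The abstract compares Q-Learning and SARSA, which differ only in how they form the bootstrap target. The following is a minimal tabular sketch of the two textbook update rules, not the authors' implementation: the state tuple, the integer action indices, and the `alpha`/`gamma` values are hypothetical placeholders, and the paper's actual 10 state parameters, its action set of PowerShell Empire modules, and its A2C model are not reproduced here.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical encoding: a state is a tuple of integers, an action is an index.
State = Tuple[int, ...]
QTable = Dict[Tuple[State, int], float]


def q_learning_update(q: QTable, s: State, a: int, r: float, s_next: State,
                      actions: Sequence[int],
                      alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Off-policy update: bootstrap from the best action value in s_next."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q[(s, a)]


def sarsa_update(q: QTable, s: State, a: int, r: float, s_next: State,
                 a_next: int,
                 alpha: float = 0.1, gamma: float = 0.9) -> float:
    """On-policy update: bootstrap from the action actually taken in s_next."""
    next_value = q.get((s_next, a_next), 0.0)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * next_value - old)
    return q[(s, a)]
```

Given the same transition, the two rules diverge whenever the next action the agent actually takes is not the greedy one, which is one reason their learning curves can differ on the same task.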
Pages: 13
Related papers
25 in total
  • [1] [Anonymous], 2020, 29 USENIX SEC S USEN
  • [2] Apruzzese G, 2018, 10 INT C CYB CONFL C
  • [3] Machine Learning Cyberattack and Defense Strategies
    Bland, John A.
    Petty, Mikel D.
    Whitaker, Tymaine S.
    Maxwell, Katia P.
    Cantrell, Walter Alan
    [J]. COMPUTERS & SECURITY, 2020, 92
  • [4] Boileau A., 2005, BLACKHAT BRIEFINGS
  • [5] A new hybrid approach for intrusion detection using machine learning methods
    Cavusoglu, Unal
    [J]. APPLIED INTELLIGENCE, 2019, 49 (07) : 2735 - 2761
  • [6] SMOTE: Synthetic minority over-sampling technique
    Chawla, Nitesh V.
    Bowyer, Kevin W.
    Hall, Lawrence O.
    Kegelmeyer, W. Philip
    [J]. 2002, American Association for Artificial Intelligence (16)
  • [7] Detection of Malicious Code Variants Based on Deep Learning
    Cui, Zhihua
    Xue, Fei
    Cai, Xingjuan
    Cao, Yang
    Wang, Gai-ge
    Chen, Jinjun
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2018, 14 (07) : 3187 - 3196
  • [8] Duckwall S., 2013, HELLO MY NAME IS MIC
  • [9] Dunagan J, 2009, P 22 ACM S OP SYST P
  • [10] Adversarial Reinforcement Learning in a Cyber Security Simulation
    Elderman, Richard
    Pater, Leon J. J.
    Thie, Albert S.
    Drugan, Madalina M.
    Wiering, Marco A.
    [J]. ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2017, : 559 - 566