Minimizing Malware Propagation in Internet of Things Networks: An Optimal Control Using Feedback Loop Approach

Cited by: 6
Authors
Tayseer Jafar, Mousa [1]
Yang, Lu-Xing [1]
Li, Gang [1]
Zhu, Qingyi [2]
Gan, Chenquan [2]
Affiliations
[1] Deakin Univ, Sch Informat Technol, Melbourne, Vic 3125, Australia
[2] Chongqing Univ Posts & Telecommun, Sch Cyber Secur & Informat Law, Chongqing 400065, Peoples R China
Keywords
Malware; Internet of Things; Optimal control; Prevention and mitigation; Resource management; Epidemics; Costs; hybrid framework; feedback loop; closed-loop; model predictive control; reinforcement learning; IoT; VIRUS PROPAGATION; DYNAMICAL ANALYSIS; COMPUTER VIRUS; MODEL; INFORMATION; STABILITY; EPIDEMICS;
DOI
10.1109/TIFS.2024.3463965
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Despite extensive research on optimal control formulations for cyber threat mitigation, a significant gap persists between theory and practical implementation in real-time scenarios. The open-loop structure of the optimal control framework is not robust enough to address cyber threats effectively. To overcome this, a model learning process that iteratively updates the optimal control strategy is adopted. This paper proposes an approach to addressing cybersecurity attacks in Internet of Things (IoT) networks that integrates reinforcement learning (RL) and model predictive control (MPC) in a hybrid framework to optimize control parameters and improve the system's effectiveness in combating malware. The approach aims to overcome the limitations of previous approaches and establish superior control strategies for IoT network security. It makes the mitigation process more adaptable and responsive, improving the handling of evolving cyber threats in real-world applications. By combining RL algorithms with the proactive capabilities of MPC, the framework enhances the security and resilience of IoT networks against malicious activities. A comprehensive evaluation demonstrates the effectiveness and efficiency of the hybrid framework, highlighting its potential to protect IoT networks from evolving cybersecurity risks. The RL agent is not used solely to compute control actions that optimize closed-loop performance and stability. Its main purpose is to accurately estimate model parameters that are currently unknown but lie within known bounds.
The simulation results support the effectiveness of this methodology in mitigating malware propagation and show superior performance compared to state-of-the-art methods. The hybrid RL-MPC framework (RLMPC) rapidly initiated recovery, achieving full network restoration in 8 seconds and recovering 60 IoT devices. The evaluation also covered average speed, scalability, and performance under various cyber-attack scenarios.
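The closed-loop idea described in the abstract can be illustrated with a toy sketch. This is not the paper's RLMPC algorithm: the SIS-style infection model, all parameter values, the grid-search controller, and the one-step model-inversion estimator (a simple stand-in for the RL agent) are assumptions made purely for illustration, and every function name below is hypothetical. The sketch only shows the feedback pattern: observe the network, re-estimate an unknown infection rate within known bounds, re-plan over a short horizon, and apply the first control action.

```python
def step(infected, beta, u, n=100.0, dt=0.1):
    """One Euler step of a toy SIS-style malware model (an assumption, not the paper's model)."""
    new = infected + dt * (beta * infected * (n - infected) / n - u * infected)
    return min(max(new, 0.0), n)


def estimate_beta(prev, cur, u, beta_lo, beta_hi, n=100.0, dt=0.1):
    """Invert the toy model for beta from one observed transition, then clip to the known bounds.

    Stand-in for the RL-based estimator mentioned in the abstract.
    """
    denom = dt * prev * (n - prev) / n
    if denom <= 1e-9:
        return None  # transition carries no information about beta
    beta = ((cur - prev) + dt * u * prev) / denom
    return min(max(beta, beta_lo), beta_hi)


def mpc_action(infected, beta_hat, horizon=10,
               candidates=(0.0, 0.25, 0.5, 0.75, 1.0), cost_u=5.0):
    """Pick the constant recovery rate that minimizes predicted cost over the horizon."""
    best_u, best_cost = candidates[0], float("inf")
    for u in candidates:
        x, cost = infected, 0.0
        for _ in range(horizon):
            x = step(x, beta_hat, u)
            cost += x + cost_u * u  # infected devices plus control effort
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u


def run_closed_loop(true_beta=0.8, beta_lo=0.2, beta_hi=1.0, steps=200):
    """Feedback loop: plan with the current estimate, act, observe, re-estimate."""
    infected = 10.0
    beta_hat = 0.5 * (beta_lo + beta_hi)  # initial guess: midpoint of the bounds
    history = [infected]
    for _ in range(steps):
        u = mpc_action(infected, beta_hat)       # plan with the estimated model
        nxt = step(infected, true_beta, u)       # the "real" network responds
        est = estimate_beta(infected, nxt, u, beta_lo, beta_hi)
        if est is not None:
            beta_hat = est                       # feedback: update the model
        infected = nxt
        history.append(infected)
    return history, beta_hat


if __name__ == "__main__":
    history, beta_hat = run_closed_loop()
    print(f"estimated beta: {beta_hat:.3f}, final infected: {history[-1]:.2f}")
```

An open-loop controller would compute its whole control schedule once from the initial guess of the infection rate. In this sketch the estimate is corrected after every observed transition, so later plans use a model closer to the simulated network.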
Pages: 9682-9697
Number of pages: 16