A reinforcement learning-based approach for online optimal control of self-adaptive real-time systems

Cited: 3
Authors
Haouari, Bakhta [1 ,2 ,3 ]
Mzid, Rania [1 ,4 ]
Mosbahi, Olfa [3 ]
Affiliations
[1] Univ Tunis El Manar, ISI, 2 Rue Abourraihan Al Bayrouni, Ariana 2080, Tunisia
[2] Univ Carthage, Ctr Urbain Nord, LISI Lab INSAT, BP 676, Tunis 1080, Tunisia
[3] Univ Carthage, Tunisia Polytech Sch, BP 743, La Marsa 2078, Tunisia
[4] Univ Sfax, CES Lab ENIS, BP W3, Sfax 3038, Tunisia
Source
NEURAL COMPUTING & APPLICATIONS | 2023, Vol. 35, No. 27
Keywords
Embedded real-time systems; Self-adaptation; Scheduling; Online reinforcement learning; Optimal control; Robustness;
DOI
10.1007/s00521-023-08778-5
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper deals with self-adaptive real-time embedded systems (RTES). A self-adaptive system can operate in different modes, each encoding a set of real-time tasks. To be executed, each task is allocated to a processor (placement) and assigned a priority (scheduling) while respecting timing constraints. An adaptation scenario switches between modes by adding, removing, or updating tasks, whose parameters must still meet the related deadlines after adaptation. For such systems, anticipating all operational modes at design time is usually impossible, and online reinforcement learning is increasingly used to cope with design-time uncertainty. To tackle this problem, we formalize the placement and scheduling problems in self-adaptive RTES as a Markov decision process and propose related algorithms based on Q-learning. We then introduce an approach that integrates these algorithms to assist designers in developing self-adaptive RTES. At the design level, the RL Placement and RL Scheduler modules process predictable adaptation scenarios; they generate placement and scheduling models for an application while maximizing system extensibility and ensuring real-time feasibility. At the execution level, the RL Adapter processes online adaptations: the RL Adapter agent rejects an adaptation scenario when feasibility concerns are raised and otherwise generates a new feasible placement and scheduling. We apply and simulate the proposed approach on a healthcare robot case study to show its applicability, and performance evaluations demonstrate its effectiveness compared to related works.
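To make the abstract's formulation concrete, the following is a minimal, illustrative sketch of tabular Q-learning applied to task-to-processor placement. It is not the authors' implementation: the task set, the per-processor utilization test (a sufficient feasibility condition for EDF, U <= 1), and the balance-based extensibility reward are all hypothetical stand-ins for the paper's MDP, which additionally handles priority assignment and online adaptation.

```python
import random

# Hypothetical task set: (name, WCET, period); deadlines assumed equal to periods.
TASKS = [("t1", 2, 10), ("t2", 3, 15), ("t3", 5, 20), ("t4", 4, 40)]
N_PROCS = 2

def feasible(assignment):
    """Per-processor utilization test (sufficient for EDF: U <= 1 per processor)."""
    util = [0.0] * N_PROCS
    for (_, c, p), proc in zip(TASKS, assignment):
        util[proc] += c / p
    return all(u <= 1.0 for u in util), util

def reward(assignment):
    """Negative reward for infeasible placements; otherwise reward balanced
    residual capacity as a crude proxy for system extensibility."""
    ok, util = feasible(assignment)
    if not ok:
        return -1.0
    return 1.0 - (max(util) - min(util))

# Tabular Q-learning over an episodic MDP:
# state = (index of next task, partial assignment), action = processor choice.
Q = {}
alpha, gamma, eps = 0.5, 0.9, 0.2
for episode in range(2000):
    assignment = []
    for i in range(len(TASKS)):
        key = (i, tuple(assignment))
        qs = Q.setdefault(key, [0.0] * N_PROCS)
        # Epsilon-greedy action selection.
        a = random.randrange(N_PROCS) if random.random() < eps else qs.index(max(qs))
        assignment.append(a)
        # Reward is granted only once the full placement is known.
        r = reward(assignment) if i == len(TASKS) - 1 else 0.0
        if i < len(TASKS) - 1:
            nxt = max(Q.get((i + 1, tuple(assignment)), [0.0] * N_PROCS))
        else:
            nxt = 0.0
        qs[a] += alpha * (r + gamma * nxt - qs[a])

# Greedy rollout of the learned policy.
assignment = []
for i in range(len(TASKS)):
    qs = Q.get((i, tuple(assignment)), [0.0] * N_PROCS)
    assignment.append(qs.index(max(qs)))
print("placement:", assignment, "feasible:", feasible(assignment)[0])
```

In this toy setting the state space is tiny, so a Q-table suffices; the design-level RL Placement/RL Scheduler described in the abstract would additionally encode priorities and richer extensibility metrics, and the RL Adapter would run such an agent online to accept or reject incoming adaptation scenarios.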
Pages: 20375 - 20401 (27 pages)
Related papers
50 records in total
  • [21] Deep Reinforcement Learning-Based Dynamic Droop Control Strategy for Real-Time Optimal Operation and Frequency Regulation
    Lee, Woon-Gyu
    Kim, Hak-Man
    IEEE TRANSACTIONS ON SUSTAINABLE ENERGY, 2025, 16 (01) : 284 - 294
  • [22] Adaptive and Real-time Optimal Control for Adaptive Optics Systems
    Doelman, Niek
    Fraanje, Rufus
    Houtzager, Ivo
    Verhaegen, Michel
    EUROPEAN JOURNAL OF CONTROL, 2009, 15 (3-4) : 480 - 488
  • [23] Control Delay in Reinforcement Learning for Real-Time Dynamic Systems: A Memoryless Approach
    Schuitema, Erik
    Busoniu, Lucian
    Babuska, Robert
    Jonker, Pieter
    IEEE/RSJ 2010 INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2010), 2010, : 3226 - 3231
  • [24] A reinforcement learning-based online learning strategy for real-time short-term load forecasting
    Wang, Xinlin
    Wang, Hao
    Li, Shengping
    Jin, Haizhen
    ENERGY, 2024, 305
  • [25] A novel continual reinforcement learning-based expert system for self-optimization of soft real-time systems
    Masood, Zafar
    Jiangbin, Zheng
    Ahmad, Idrees
    Dongdong, Chai
    Shabbir, Wasif
    Irfan, Muhammad
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [26] Real-time operation of distribution network: A deep reinforcement learning-based reconfiguration approach
    Bui, Van-Hai
    Su, Wencong
    SUSTAINABLE ENERGY TECHNOLOGIES AND ASSESSMENTS, 2022, 50
  • [27] A self-adaptive SAC-PID control approach based on reinforcement learning for mobile robots
    Yu, Xinyi
    Fan, Yuehai
    Xu, Siyu
    Ou, Linlin
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (18) : 9625 - 9643
  • [28] A Real-Time Self-Adaptive Classifier for Identifying Suspicious Bidders in Online Auctions
    Ford, Benjamin J.
    Xu, Haiping
    Valova, Iren
    COMPUTER JOURNAL, 2013, 56 (05) : 646 - 663
  • [29] Learning-based adaptive optimal control of linear time-delay systems: A value iteration approach
    Cui, Leilei
    Pang, Bo
    Krstic, Miroslav
    Jiang, Zhong-Ping
    AUTOMATICA, 2025, 171
  • [30] Learning-Based Adaptive Optimal Control of Linear Time-Delay Systems: A Policy Iteration Approach
    Cui, Leilei
    Pang, Bo
    Jiang, Zhong-Ping
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (01) : 629 - 636