A model-based reinforcement learning approach for maintenance optimization of degrading systems in a large state space

Cited by: 26

Authors:

Zhang, Ping [1,2]

Zhu, Xiaoyan [1]

Xie, Min [2,3]

Affiliations:

[1] Univ Chinese Acad Sci, Sch Econ & Management, Bldg 7,80 Zhongguancun East Rd, Beijing, Peoples R China

[2] City Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China

[3] City Univ Hong Kong, Sch Data Sci, Hong Kong, Peoples R China

Source:

COMPUTERS & INDUSTRIAL ENGINEERING | 2021 / Vol. 161

Funding:

National Natural Science Foundation of China;

Keywords:

Maintenance optimization; Periodic inspection; Model-based reinforcement learning; Degrading system; PREDICTIVE MAINTENANCE; DEGRADATION; RELIABILITY; POLICY; ANALYTICS; SUBJECT; PARTS;

DOI:

10.1016/j.cie.2021.107622

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification code:

081203 ; 0835 ;

Abstract:

Maintenance scheduling based on a deterioration process has usually relied on degradation models. However, the formula of the degradation process is often unknown and hard to determine for a system operating in practice. In this study, we develop a model-based reinforcement learning approach for maintenance optimization. The approach determines a maintenance action for each degradation state at each inspection time over a finite planning horizon, whether the degradation formula is known or unknown. At each inspection time, the approach attempts to learn an optimal assessment value for each maintenance action that can be performed at each degradation state. The assessment value quantifies the goodness of each state-action pair in terms of minimizing the accumulated maintenance costs over the planning horizon. To optimize the assessment values when a well-defined degradation formula is known, we customize a Q-learning method with model-based acceleration. When the degradation formula is unknown or hard to determine, we develop a Dyna-Q method with maintenance-oriented improvements, in which an environment model capturing the degradation pattern under different maintenance actions is learned first; then, the assessment values are optimized while accounting for the stochastic behavior of the system degradation. The final maintenance policy is obtained by performing, in each state, the maintenance action associated with the highest assessment value. Experimental studies are presented to illustrate the applications.
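
The Dyna-Q idea named in the abstract can be illustrated with a minimal tabular sketch. This is generic Dyna-Q on a toy problem, not the authors' maintenance-oriented variant: the five degradation levels, the two actions, the costs, the simulator `step`, and the assumption that degradation does not depend on the inspection epoch are all hypothetical choices made for illustration. The table `Q` here stores an estimated cost-to-go, so the best action is the one with the lowest value, which plays the role of the "highest assessment value" in the abstract.

```python
import random

# All constants below are hypothetical and chosen only for illustration.
N_STATES = 5                      # degradation levels 0 (new) .. 4 (failed)
N_ACTIONS = 2                     # 0 = do nothing, 1 = replace
HORIZON = 10                      # number of inspection epochs
COST_REPLACE, COST_FAILURE = 5.0, 20.0


def step(state, action, rng):
    """Toy simulator standing in for the real (unknown) degrading system."""
    if action == 1:               # replacement restores the system to level 0
        return 0, COST_REPLACE
    nxt = min(state + rng.choice([0, 1, 1, 2]), N_STATES - 1)
    cost = COST_FAILURE if nxt == N_STATES - 1 else 0.0
    return nxt, cost


def dyna_q(episodes=300, planning_steps=10, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Q[t][s][a] estimates the cost-to-go; Q[HORIZON] stays zero (terminal).
    Q = [[[0.0] * N_ACTIONS for _ in range(N_STATES)]
         for _ in range(HORIZON + 1)]
    model = {}                    # (s, a) -> observed (next_state, cost) samples

    def update(t, s, a, s2, cost):
        target = cost + min(Q[t + 1][s2])
        Q[t][s][a] += alpha * (target - Q[t][s][a])

    for _ in range(episodes):
        s = 0
        for t in range(HORIZON):
            # Epsilon-greedy action selection on the cost-to-go estimates.
            if rng.random() < epsilon:
                a = rng.randrange(N_ACTIONS)
            else:
                a = min(range(N_ACTIONS), key=lambda b: Q[t][s][b])
            s2, cost = step(s, a, rng)
            update(t, s, a, s2, cost)                        # direct RL update
            model.setdefault((s, a), []).append((s2, cost))  # model learning
            # Planning: replay transitions sampled from the learned model.
            # Reusing a transition at any epoch assumes that degradation
            # is time-homogeneous (an assumption of this sketch).
            for _ in range(planning_steps):
                (ps, pa), samples = rng.choice(list(model.items()))
                ps2, pcost = rng.choice(samples)
                update(rng.randrange(HORIZON), ps, pa, ps2, pcost)
            s = s2
    return Q


def greedy_policy(Q):
    """Pick the lowest-cost action for every epoch and degradation level."""
    return [[min(range(N_ACTIONS), key=lambda a: Q[t][s][a])
             for s in range(N_STATES)] for t in range(HORIZON)]


if __name__ == "__main__":
    for t, row in enumerate(greedy_policy(dyna_q())):
        print(t, row)
```

Storing observed samples per state-action pair keeps the learned model stochastic, so planning updates reflect the randomness of degradation rather than a single remembered outcome.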

Pages: 14
