Reinforcement learning based maintenance scheduling of flexible multi-machine manufacturing systems with varying interactive degradation

Cited: 0
Authors
Chen, Jiangxi [1 ]
Zhou, Xiaojun [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Ind Engn, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Flexible manufacturing system; Maintenance scheduling; Interactive degradation; Hidden-Mode Markov Decision Process; Reinforcement learning; Graph neural network; MODEL;
DOI
10.1016/j.ress.2025.111018
Chinese Library Classification
T [Industrial Technology];
Subject Classification Code
08;
Abstract
In flexible multi-machine manufacturing systems, variations in product types dynamically influence machine loads and, in turn, the degradation processes of the machines. Moreover, the interactive degradation between upstream and downstream machines, caused by product quality deviations, varies with the production routes of the different product types. These factors, combined with uncertain production schedules, make effective maintenance scheduling challenging. To address these challenges, the maintenance scheduling problem is modeled as a Hidden-Mode Markov Decision Process (HM-MDP), in which product types are treated as hidden modes that influence machine degradation and the resulting maintenance decisions. An Interactive Degradation-Aware Proximal Policy Optimization (IDAPPO) reinforcement learning framework is introduced, which enhances the PPO algorithm with Graph Neural Networks (GNNs) to capture interactive degradation among machines and Long Short-Term Memory (LSTM) networks to handle temporal variations in production schedules. An entropy-based exploration strategy further manages the uncertainty of production schedules, enabling IDAPPO to adaptively optimize maintenance actions. Extensive experiments on both small-scale (5-machine) and large-scale (24-machine) systems show that IDAPPO significantly reduces system losses and converges faster than baseline approaches. These results indicate that IDAPPO provides a scalable and adaptive solution for improving the efficiency and reliability of complex manufacturing environments.
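The record does not include the authors' implementation; the sketch below is only an illustration, in Python/PyTorch, of the kind of actor-critic network the abstract describes: a graph layer over the machine-interaction graph combined with an LSTM over recent production-schedule features, feeding per-machine maintenance-action logits and a state value for a PPO-style update. The class names (GraphLayer, IDAPPONet), layer sizes, adjacency encoding, and discrete per-machine action space are assumptions for illustration, not the paper's architecture.

# Minimal sketch (not the authors' implementation) of a GNN+LSTM actor-critic
# network in the spirit of IDAPPO. All dimensions and names are illustrative.
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    """One round of mean-aggregation message passing over the machine graph."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin_self = nn.Linear(in_dim, out_dim)
        self.lin_neigh = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (n_machines, in_dim); adj: (n_machines, n_machines), row-normalized,
        # encoding assumed interactive-degradation links between machines
        return torch.relu(self.lin_self(x) + self.lin_neigh(adj @ x))

class IDAPPONet(nn.Module):
    def __init__(self, machine_feat_dim, schedule_feat_dim, n_actions, hidden=64):
        super().__init__()
        self.gnn = GraphLayer(machine_feat_dim, hidden)
        self.lstm = nn.LSTM(schedule_feat_dim, hidden, batch_first=True)
        self.actor = nn.Linear(2 * hidden, n_actions)   # per-machine maintenance logits
        self.critic = nn.Linear(2 * hidden, 1)          # pooled state value

    def forward(self, machine_feats, adj, schedule_seq):
        # machine_feats: (n_machines, machine_feat_dim) degradation-state features
        # schedule_seq:  (1, seq_len, schedule_feat_dim) recent production schedule
        g = self.gnn(machine_feats, adj)                      # (n_machines, hidden)
        _, (h, _) = self.lstm(schedule_seq)                   # h: (1, 1, hidden)
        ctx = h[-1].expand(g.size(0), -1)                     # broadcast schedule context
        joint = torch.cat([g, ctx], dim=-1)                   # (n_machines, 2*hidden)
        logits = self.actor(joint)
        value = self.critic(joint.mean(dim=0, keepdim=True))  # (1, 1)
        return torch.distributions.Categorical(logits=logits), value

In a standard PPO loop, the entropy of the returned per-machine action distributions could serve as the exploration bonus the abstract refers to; how the paper actually weights or schedules that bonus is not specified in this record.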
Pages: 17