Optimal preventive maintenance policy based on reinforcement learning of a fleet of military trucks

Cited by: 32
Authors
Barde, Stephane R. A. [1 ]
Yacout, Soumaya [2 ]
Shin, Hayong [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, 291 Daehak Ro, Daejeon 34141, South Korea
[2] Ecole Polytech Montreal, Montreal, PQ, Canada
Keywords
Preventive maintenance; Opportunistic maintenance; Markov decision process; Reinforcement learning
DOI
10.1007/s10845-016-1237-7
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we use a Markov decision process (MDP) to model preventive maintenance strategies for equipment composed of multiple non-identical components that have different time-to-failure probability distributions. The originality of this paper lies in the use of a Monte Carlo reinforcement learning (MCRL) approach to find the optimal policy for each strategy. The approach is applied to a previously published application that deals with a fleet of military trucks. The fleet consists of a group of similar trucks that are composed of non-identical components. The problem is formulated as an MDP and solved by an MCRL technique. The advantage of this modeling technique over the published one is that the main parameters of the model, such as the transition probabilities, need not be estimated beforehand. These parameters are treated as variables and are found by the modeling technique while it searches for the optimal solution. Moreover, the technique is not bound by any explicit mathematical formula, and it converges to the optimal solution, whereas the previous model optimizes the replacement policy of each component separately, which leads to a local optimum. The results show that the reinforcement learning approach yields a solution that is 36.44% better, that is, with less downtime.
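As an illustration of the model-free idea described in the abstract, the following is a minimal sketch of Monte Carlo control on a toy single-component replacement problem. It is not the authors' model: the age discretisation, the ageing law inside `simulate_step`, the downtime values, the discount factor, and all function names are hypothetical assumptions chosen only to show how a policy can be learned from sampled episodes without ever estimating transition probabilities.

```python
import random

# All constants below are hypothetical; none come from the paper.
MAX_AGE = 5            # component ages are discretised as 0..MAX_AGE
KEEP, REPLACE = 0, 1   # the two maintenance actions
PM_DOWNTIME = 1.0      # assumed downtime of a preventive replacement
CM_DOWNTIME = 5.0      # assumed downtime after a failure
GAMMA = 0.9            # assumed discount factor


def simulate_step(age, action, rng):
    """Black-box simulator; the learner never reads the failure probabilities."""
    if action == REPLACE:
        return 0, -PM_DOWNTIME
    fail_prob = min(1.0, 0.05 + 0.15 * age)  # hypothetical ageing law
    if rng.random() < fail_prob:
        return 0, -CM_DOWNTIME               # failure, then renewal
    return min(age + 1, MAX_AGE), 0.0        # survives and gets older


def greedy(q, age):
    """Action with the larger estimated return (ties go to KEEP)."""
    return KEEP if q[age][KEEP] >= q[age][REPLACE] else REPLACE


def mc_control(episodes=5000, horizon=40, epsilon=0.1, seed=0):
    """Every-visit Monte Carlo control with exploring starts."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(MAX_AGE + 1)]
    visits = [[0, 0] for _ in range(MAX_AGE + 1)]
    for _ in range(episodes):
        # Exploring start: random initial age and random first action.
        age = rng.randint(0, MAX_AGE)
        action = rng.choice((KEEP, REPLACE))
        trajectory = []
        for _step in range(horizon):
            next_age, reward = simulate_step(age, action, rng)
            trajectory.append((age, action, reward))
            age = next_age
            # Epsilon-greedy choice for the next step.
            if rng.random() < epsilon:
                action = rng.choice((KEEP, REPLACE))
            else:
                action = greedy(q, age)
        # Walk the episode backwards and average the sampled returns.
        g = 0.0
        for s, a, r in reversed(trajectory):
            g = r + GAMMA * g
            visits[s][a] += 1
            q[s][a] += (g - q[s][a]) / visits[s][a]
    policy = [greedy(q, s) for s in range(MAX_AGE + 1)]
    return q, policy


q_values, policy = mc_control()
print("learned action per age (0 = keep, 1 = replace):", policy)
```

The learner only calls `simulate_step` and averages the returns it observes, which mirrors the abstract's point that transition probabilities are not estimated explicitly. A multi-component fleet model would need a much larger state space than this sketch uses.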
Pages: 147-161
Number of pages: 15