REINFORCEMENT LEARNING BASED REAL-TIME CONTROL POLICY FOR TWO-MACHINE-ONE-BUFFER PRODUCTION SYSTEM

Citations: 0
Authors
Zheng, Wei [1]
Lei, Yong [1]
Chang, Qing [2]
Affiliations
[1] Zhejiang Univ, State Key Lab Fluid Power Transmiss & Control, Hangzhou 310027, Peoples R China
[2] SUNY Stony Brook, Dept Mech Engn, Stony Brook, NY 11794 USA
Source
PROCEEDINGS OF THE ASME 12TH INTERNATIONAL MANUFACTURING SCIENCE AND ENGINEERING CONFERENCE - 2017, VOL 3 | 2017
Funding
National Natural Science Foundation of China;
Keywords
PREVENTIVE MAINTENANCE; MANUFACTURING SYSTEM; DECOMPOSITION METHOD; EFFICIENT; LINES; FLOW;
DOI
Not available
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
It is attractive to reduce the total cost of a manufacturing system through real-time control of production. The total cost mainly consists of the production cost, the penalty for permanent production loss, and the Work-In-Process (WIP) inventory cost. However, it is difficult to derive an analytical model of a manufacturing system because of the complexity of starvation and blockage phenomena and of the random failure and maintenance processes. Therefore, a real-time control policy that does not require an exact analytical model is greatly needed. In this paper, a novel reinforcement learning based control policy is proposed, in which the action is to switch each machine on or off at the start of each time slot. First, a simulation model is developed, with MTBF and MTTR estimated from historical data, to collect samples. Then, a reinforcement learning method, specifically Least-Squares Policy Iteration (LSPI), is applied to obtain a sub-optimal policy. The simulation results show that the proposed method performs well in reducing the total cost.
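The sample-collection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator: it assumes that each machine fails with probability 1/MTBF per slot while up and is repaired with probability 1/MTTR per slot while down, and that the per-slot cost is a weighted sum of a running cost, a WIP holding cost, and a penalty for each slot in which the second machine produces nothing. All names (`LineState`, `step`, `collect_samples`) and all numeric values are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LineState:
    up1: bool = True   # machine 1 is operational (not failed)
    up2: bool = True   # machine 2 is operational (not failed)
    buffer: int = 0    # parts currently held in the buffer (WIP)


def _next_up(up, mtbf, mttr, rng):
    # Assumed geometric model: fail w.p. 1/MTBF when up, repair w.p. 1/MTTR when down.
    if up:
        return rng.random() >= 1.0 / mtbf
    return rng.random() < 1.0 / mttr


def step(state, action, rng, mtbf=(50.0, 40.0), mttr=(5.0, 8.0), capacity=5,
         run_cost=1.0, wip_cost=0.2, loss_penalty=3.0):
    """Advance one time slot; action = (on1, on2) chosen at the start of the slot."""
    on1, on2 = action
    produce1 = state.up1 and on1 and state.buffer < capacity  # blocked if buffer is full
    produce2 = state.up2 and on2 and state.buffer > 0         # starved if buffer is empty
    buffer = state.buffer + int(produce1) - int(produce2)
    # Hypothetical cost weights: running cost + WIP holding cost + lost-output penalty.
    cost = (run_cost * (int(produce1) + int(produce2))
            + wip_cost * buffer
            + loss_penalty * (0 if produce2 else 1))
    next_state = LineState(
        up1=_next_up(state.up1, mtbf[0], mttr[0], rng),
        up2=_next_up(state.up2, mtbf[1], mttr[1], rng),
        buffer=buffer,
    )
    return next_state, cost


def collect_samples(n_slots, rng, **params):
    """Run a uniformly random on/off policy and record (s, a, cost, s') tuples."""
    state, samples = LineState(), []
    for _ in range(n_slots):
        action = (rng.random() < 0.5, rng.random() < 0.5)
        next_state, cost = step(state, action, rng, **params)
        samples.append((state, action, cost, next_state))
        state = next_state
    return samples
```

Tuples of this form are the kind of input a batch method such as LSPI consumes; the feature representation and the policy-iteration loop are omitted here.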
Pages: 9
Related papers
17 in total
[1]   OPTIMAL-CONTROL OF PRODUCTION-RATE IN A FAILURE PRONE MANUFACTURING SYSTEM [J].
AKELLA, R ;
KUMAR, PR .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1986, 31 (02) :116-126
[2]  
[Anonymous], 1998, INTRO REINFORCEMENT
[3]  
[Anonymous], 2003, J. Mach. Learn. Res.
[4]   MANUFACTURING FLOW-CONTROL AND PREVENTIVE MAINTENANCE - A STOCHASTIC-CONTROL APPROACH [J].
BOUKAS, EK ;
HAURIE, A .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1990, 35 (09) :1024-1031
[5]  
Busoniu L, 2010, AUTOM CONTROL ENG SE, P1, DOI 10.1201/9781439821091-f
[6]   Transient Analysis of Downtimes and Bottleneck Dynamics in Serial Manufacturing Systems [J].
Chang, Qing ;
Biller, Stephan ;
Xiao, Guoxian .
JOURNAL OF MANUFACTURING SCIENCE AND ENGINEERING-TRANSACTIONS OF THE ASME, 2010, 132 (05)
[7]   An improved decomposition method for the analysis of production lines with unreliable machines and finite buffers [J].
Dallery, Y ;
Le Bihan, H .
INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 1999, 37 (05) :1093-1117
[8]   AN EFFICIENT DECOMPOSITION METHOD FOR THE APPROXIMATE EVALUATION OF TANDEM QUEUES WITH FINITE STORAGE SPACE AND BLOCKING [J].
GERSHWIN, SB .
OPERATIONS RESEARCH, 1987, 35 (02) :291-305
[9]   Production and preventive maintenance rates control for a manufacturing system: An experimental design approach [J].
Gharbi, A ;
Kenne, JP .
INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS, 2000, 65 (03) :275-287
[10]   A new decomposition approach for non-cyclic continuous material flow lines with a merging flow of material [J].
Helber, S ;
Jusic, H .
ANNALS OF OPERATIONS RESEARCH, 2004, 125 (1-4) :117-139