A reinforcement learning methodology for a human resource planning problem considering knowledge-based promotion

Cited by: 10

Authors:

Karimi-Majd, Amir-Mohsen [1]

Mahootchi, Masoud [1]

Zakery, Amir [1]

Affiliations:

[1] Amirkabir Univ Technol, Ind Engn & Management Syst Dept, 424 Hafez Ave, Tehran 158754413, Iran

Source:

SIMULATION MODELLING PRACTICE AND THEORY | 2017, Vol. 79

Keywords:

Reinforcement learning; Production-inventory control; Human resource planning; Stochastic dynamic programming; Knowledge-intensive; MODEL; DEMAND; FRAMEWORK; NETWORK; WORKERS;

DOI:

10.1016/j.simpat.2015.07.004

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification codes:

081203 ; 0835 ;

Abstract:

This paper addresses a combined problem of human resource planning (HRP) and production-inventory control for a high-tech industry, wherein human resources play a critical role. The main characteristics of this resource are its levels of "knowledge" and its learning process. Learning occurs during the production process, through which a worker can be promoted to the next higher knowledge level. Workers at higher levels are more productive. The objective is to maximize the expected profit by deciding on the optimal numbers of workers at the various knowledge levels to fulfill both production and training requirements. Because an action taken now affects the decisions of later periods, the main problem is to find the optimal hiring policy for non-skilled workers over a long time horizon. Thus, we develop a reinforcement learning (RL) model to obtain the optimal hiring decision under demand uncertainty. The proposed interval-based policy of our RL model, in which there are multiple choices for each state, makes it more flexible. We also embed some managerial issues, such as layoffs and overtime working hours, into the model. To evaluate the proposed methodology, stochastic dynamic programming (SDP) and a conservative method implemented in a real case study are used for comparison. We study all these methods in terms of four criteria: average obtained profit, average obtained cost, the number of newly hired workers, and the standard deviation of hiring policies. The numerical results confirm that our developed method ends up with satisfactory results compared to the two other approaches. (C) 2015 Elsevier B.V. All rights reserved.
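The hiring problem described in the abstract can be illustrated with a toy example. The sketch below is not the authors' model: it is plain tabular Q-learning (not the interval-based policy of the paper) on a hypothetical two-level workforce, and every number in it (demand levels, wages, promotion and turnover probabilities, worker caps) is an assumption chosen only for illustration. Layoffs and overtime are left out.

```python
import random
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the paper.
MAX_WORKERS = 6            # cap on workers per knowledge level
ACTIONS = (0, 1, 2)        # non-skilled workers to hire this period
DEMANDS = (2, 4, 6, 8)     # equally likely demand levels
PRICE, HIRE_COST = 10.0, 4.0
WAGE = (3.0, 5.0)          # per-period wage: (novice, skilled)
OUTPUT = (1, 2)            # units produced per worker: (novice, skilled)
P_PROMOTE = 0.3            # chance a novice is promoted after a period
P_LEAVE = 0.1              # chance a skilled worker leaves (assumed turnover)


def step(state, hires, rng):
    """Simulate one period and return (next_state, profit)."""
    novices, skilled = state
    hired = min(hires, MAX_WORKERS - novices)
    novices += hired
    capacity = novices * OUTPUT[0] + skilled * OUTPUT[1]
    demand = rng.choice(DEMANDS)
    revenue = PRICE * min(capacity, demand)
    cost = novices * WAGE[0] + skilled * WAGE[1] + HIRE_COST * hired
    # Learning during production: each novice may move up one level.
    promoted = sum(rng.random() < P_PROMOTE for _ in range(novices))
    promoted = min(promoted, MAX_WORKERS - skilled)
    # Turnover applies to the workers who were already skilled.
    left = sum(rng.random() < P_LEAVE for _ in range(skilled))
    return (novices - promoted, skilled + promoted - left), revenue - cost


def train(episodes=2000, horizon=20, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) to an estimated value
    for _ in range(episodes):
        state = (1, 0)
        for _ in range(horizon):
            if rng.random() < eps:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward = step(state, action, rng)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


def greedy_hires(q, state):
    """Number of hires with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q_table = train()
    for s in [(0, 0), (1, 0), (2, 2), (0, 4)]:
        print(s, "->", greedy_hires(q_table, s))
```

Here the state is the pair (novices, skilled) and the action is the number of non-skilled hires, which mirrors the abstract's point that a hiring decision made now shapes the workforce available in later periods. The learned policy depends entirely on the assumed parameters.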


Pages: 87-99

Page count: 13
