Deep reinforcement learning for a multi-objective operation in a nuclear power plant

Cited by: 8
Authors
Bae, Junyong [1]
Kim, Jae Min [1]
Lee, Seung Jun [1]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Dept Nucl Engn, 50 UNIST Gil, Ulsan 44919, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Nuclear power plant; Automation; Deep reinforcement learning; Soft actor-critic; Hindsight experience replay; LEVEL;
DOI
10.1016/j.net.2023.06.009
Chinese Library Classification number
TL [Atomic energy technology]; O571 [Nuclear physics];
Subject classification code
0827; 082701;
Abstract
Nuclear power plant (NPP) operations with multiple objectives and devices are still performed manually by operators despite the potential for human error. These operations could be automated to reduce the burden on operators; however, classical approaches may not be suitable for these multi-objective tasks. An alternative approach is deep reinforcement learning (DRL), which has been successful in automating various complex tasks and has been applied to the automation of certain operations in NPPs. Despite this recent progress, previous studies using DRL for NPP operations are limited in their ability to handle complex multi-objective operations with multiple devices efficiently. This study proposes a novel DRL-based approach that addresses these limitations by employing a continuous action space and straightforward binary rewards, supported by the adoption of a soft actor-critic and hindsight experience replay. The feasibility of the proposed approach was evaluated for controlling the pressure and volume of the reactor coolant while heating the coolant during NPP startup. The results show that the proposed approach can train the agent with a proper strategy for effectively achieving multiple objectives through the control of multiple devices. Moreover, hands-on testing results demonstrate that the trained agent is capable of handling untrained objectives, such as cooldown, with substantial success. © 2023 Korean Nuclear Society, Published by Elsevier Korea LLC. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
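The pairing of binary rewards with hindsight experience replay named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the 0/-1 reward values, the tolerance vector, the "future" relabeling strategy, the function names, and the (pressure, level) numbers are all assumptions chosen for illustration. The relabeled transitions would then be stored in the replay buffer of an off-policy learner such as a soft actor-critic, which is not shown here.

```python
import random

def binary_reward(achieved, desired, tol):
    """Assumed convention: 0.0 if every objective is within tolerance, else -1.0."""
    ok = all(abs(a - d) <= t for a, d, t in zip(achieved, desired, tol))
    return 0.0 if ok else -1.0

def her_relabel(episode, tol, k=4, rng=None):
    """Hindsight relabeling with the 'future' strategy.

    episode: list of (state, action, achieved_goal, desired_goal, reward),
             where achieved_goal is the goal vector reached after the action.
    For each transition, keep the original and add k copies whose desired
    goal is replaced by a goal actually achieved at the same or a later step.
    """
    rng = rng or random.Random(0)
    out = []
    for t, (s, a, ag, g, r) in enumerate(episode):
        out.append((s, a, ag, g, r))  # original transition, unchanged
        for _ in range(k):
            future_ag = rng.choice(episode[t:])[2]  # goal reached later on
            out.append((s, a, ag, future_ag, binary_reward(ag, future_ag, tol)))
    return out

# Illustrative episode: goals are hypothetical (pressure, level) pairs.
tol = (0.5, 1.0)
desired = (27.0, 50.0)
achieved_seq = [(20.0, 40.0), (22.0, 44.0), (24.0, 47.0)]
episode = [
    ((i,), (0.1,), ag, desired, binary_reward(ag, desired, tol))
    for i, ag in enumerate(achieved_seq)
]
relabeled = her_relabel(episode, tol, k=2)
```

In this toy episode the desired goal is never reached, so every original reward is -1.0; relabeling still yields some transitions with reward 0.0, which is what lets a sparse binary reward provide a learning signal.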
Pages: 3277-3290
Number of pages: 14
Related papers
38 in total
[1]  
Abadi M., 2016, arXiv, DOI 10.48550/arXiv.1603.04467
[2]   Deep learning-based procedure compliance check system for nuclear power plant emergency operation [J].
Ahn, Jeeyea ;
Lee, Seung Jun .
NUCLEAR ENGINEERING AND DESIGN, 2020, 370 (370)
[3]  
Andrychowicz M., 2017, Advances in neural information processing systems, P30
[4]  
Bae J., 2020, T KOREAN NUCL SOC
[5]   Limit surface/states searching algorithm with a deep neural network and Monte Carlo dropout for nuclear power plant safety assessment [J].
Bae, Junyong ;
Park, Jong Woo ;
Lee, Seung Jun .
APPLIED SOFT COMPUTING, 2022, 124
[6]   Real-time prediction of nuclear power plant parameter trends following operator actions [J].
Bae, Junyong ;
Kim, Geunhee ;
Lee, Seung Jun .
EXPERT SYSTEMS WITH APPLICATIONS, 2021, 186
[7]   Graph neural network based multiple accident diagnosis in nuclear power plants: Data optimization to represent the system configuration [J].
Chae, Young Ho ;
Lee, Chanyoung ;
Han, Sang Min ;
Seong, Poong Hyun .
NUCLEAR ENGINEERING AND TECHNOLOGY, 2022, 54 (08) :2859-2870
[8]   A methodology for diagnosing FAC induced pipe thinning using accelerometers and deep learning models [J].
Chae, Young Ho ;
Kim, Seung Geun ;
Kim, Hyeonmin ;
Kim, Jung Taek ;
Seong, Poong Hyun .
ANNALS OF NUCLEAR ENERGY, 2020, 143
[9]   A Sensor Fault-Tolerant Accident Diagnosis System [J].
Choi, Jeonghun ;
Lee, Seung Jun .
SENSORS, 2020, 20 (20) :1-17
[10]   Discovering faster matrix multiplication algorithms with reinforcement learning [J].
Fawzi, Alhussein ;
Balog, Matej ;
Huang, Aja ;
Hubert, Thomas ;
Romera-Paredes, Bernardino ;
Barekatain, Mohammadamin ;
Novikov, Alexander ;
Ruiz, Francisco J. R. ;
Schrittwieser, Julian ;
Swirszcz, Grzegorz ;
Silver, David ;
Hassabis, Demis ;
Kohli, Pushmeet .
NATURE, 2022, 610 (7930) :47-+