Offline Model-Based Reinforcement Learning for Tokamak Control

Times Cited: 0
Authors
Char, Ian [1]
Abbate, Joseph [2]
Bardoczi, Laszlo [3]
Boyer, Mark D. [2]
Chung, Youngseog [1]
Conlin, Rory [4]
Erickson, Keith [2]
Mehta, Viraj [5]
Richner, Nathan [6]
Kolemen, Egemen [2,4]
Schneider, Jeff [1,5]
Affiliations
[1] Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA
[2] Princeton Plasma Phys Lab, Princeton, NJ USA
[3] Gen Atom, San Diego, CA USA
[4] Princeton Univ, Dept Mech & Aerosp Engn, Princeton, NJ USA
[5] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA USA
[6] Oak Ridge Associated Univ, Oak Ridge, TN USA
Source
LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, 2023, Vol. 211
Funding
US National Science Foundation;
Keywords
Reinforcement Learning; Model-Based Reinforcement Learning; Tokamak Control;
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Control of tokamaks, the leading candidate technology for nuclear fusion, is an important pursuit, since realizing nuclear fusion as an energy source would yield virtually unlimited clean energy. However, controlling these devices remains challenging due to their complex, non-linear dynamics. At the same time, recent algorithmic developments in reinforcement learning make learned controllers promising for difficult problems. Because every run (or shot) of a tokamak is extremely expensive, in this work we investigated learning a controller from logged data before testing it on a tokamak. In particular, we used 18 years of data from the DIII-D device to learn a controller for the neutral beams that targets specified β_N (the normalized ratio of plasma pressure to magnetic pressure) and rotation quantities. To do so, we first used the data to learn a dynamics model, and then used this model as a simulator to generate experience for training a controller via reinforcement learning. During a control session on DIII-D, we tested both the ability of our dynamics model to design feedforward trajectories and the controller's ability to perform feedback control toward specified targets. This work marks some of the first steps toward reinforcement learning for tokamak control from historical data alone.
Pages: 16
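
The abstract above outlines a two-step recipe: fit a dynamics model to logged transitions, then treat that model as a simulator when choosing or training controls. Below is a minimal, hypothetical PyTorch sketch of that pipeline, not the authors' code; the state/action sizes, the assumption that β_N is the first state component, and the random-shooting planner used in place of a full reinforcement learning training loop are all illustrative assumptions.

import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2  # assumed sizes for plasma state / neutral-beam commands

class DynamicsModel(nn.Module):
    """Predicts the next state from (state, action) via a residual MLP."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, STATE_DIM),
        )

    def forward(self, state, action):
        return state + self.net(torch.cat([state, action], dim=-1))

def fit_model(model, states, actions, next_states, epochs=200, lr=1e-3):
    """Step 1: supervised regression on logged (s, a, s') transitions."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        loss = ((model(states, actions) - next_states) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

def plan_action(model, state, target, horizon=5, n_candidates=256):
    """Step 2 (simplified): random-shooting control inside the learned
    simulator. The paper trains an RL policy on model-generated experience
    instead; shooting keeps this sketch short while using the model the
    same way, as a stand-in for the real device."""
    with torch.no_grad():
        seqs = torch.rand(n_candidates, horizon, ACTION_DIM)  # candidate action sequences
        s = state.expand(n_candidates, STATE_DIM).clone()
        cost = torch.zeros(n_candidates)
        for t in range(horizon):
            s = model(s, seqs[:, t])
            cost += (s[:, 0] - target) ** 2  # assume state[0] tracks beta_N
        return seqs[cost.argmin(), 0]  # first action of the best sequence

# Usage with synthetic stand-in data (real inputs would come from DIII-D shot logs):
s = torch.rand(1024, STATE_DIM)
a = torch.rand(1024, ACTION_DIM)
s_next = s + 0.1 * torch.randn(1024, STATE_DIM)
model = DynamicsModel()
fit_model(model, s, a, s_next)
u0 = plan_action(model, s[:1], target=1.8)

In the paper's full setup, the learned model would generate rollouts for a reinforcement learning algorithm to train a feedback policy, and the same model can be rolled out open-loop to design feedforward trajectories.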