Online Operational Decision-Making for Integrated Electric-Gas Systems With Safe Reinforcement Learning

Cited by: 10
Authors
Sayed, Ahmed Rabee [1, 2]
Zhang, Xian [1]
Wang, Guibin [3]
Qiu, Jing [4]
Wang, Cheng [5]
Affiliations
[1] Harbin Inst Technol, Sch Mech Engn & Automat, Shenzhen 518055, Peoples R China
[2] Cairo Univ, Fac Engn, Elect Power & Machines Dept, Giza 12411, Egypt
[3] Shenzhen Univ, Coll Mechatron & Control Engn, Shenzhen 518060, Peoples R China
[4] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
[5] North China Elect Power Univ, Sch Elect & Elect Engn, Beijing 102206, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Safe learning; optimal energy flow; reinforcement learning; fast control; integrated energy systems; soft actor-critic; NATURAL-GAS; OPTIMAL POWER; RELAXATION; NETWORKS;
DOI
10.1109/TPWRS.2023.3320172
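Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code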
0808 ; 0809 ;
Abstract
Increasing interdependencies between power and gas systems and integrating large-scale intermittent renewable energy increase the complexity of energy management problems. This article proposes a model-free safe deep reinforcement learning (DRL) approach to find fast optimal energy flow (OEF), guaranteeing its feasibility in real-time operation with high computational efficiency. A constrained Markov decision process model is standardized for the optimization problem of OEF with a limited number of state and control actions and developing a robust integrated environment. Because state-of-the-art DRL algorithms lack safety guarantees, this article develops a soft-constraint enforcement method to adaptively encourage the control policy in the safety direction with non-conservative control actions. The overall procedure, namely the constrained soft actor-critic (C-SAC) algorithm, is off-policy, entropy maximization-based, sample-efficient, and scalable with low hyper-parameter sensitivity. The proposed C-SAC algorithm validates its superiority over the existing learning-based safety ones and OEF solution methods by finding fast OEF decisions with near-zero degrees of constraint violations. The proposed approach indicates its practicability for real-time energy system operation and extensions for other potential applications.
Pages: 2893-2906
Number of pages: 14
Related papers
41 in total
[1]   Optimal heat and electric power flows in the presence of intermittent renewable source, heat storage and variable grid electricity tariff [J].
Ayele, Getnet Tadesse ;
Mabrouk, Mohamed Tahar ;
Haurant, Pierrick ;
Laumert, Bjorn ;
Lacarriere, Bruno .
ENERGY CONVERSION AND MANAGEMENT, 2021, 243
[2]   Fuelling power plants by natural gas: An analysis of energy efficiency, economical aspects and environmental footprint based on detailed process simulation of the whole carbon capture and storage system [J].
Barbera, Elena ;
Mio, Andrea ;
Pavan, Alessandro Massi ;
Bertucco, Alberto ;
Fermeglia, Maurizio .
ENERGY CONVERSION AND MANAGEMENT, 2022, 252
[3]   Dynamic optimization of natural gas networks under customer demand uncertainties [J].
Behrooz, Hesam Ahmadian ;
Boozarjomehry, R. Bozorgmehry .
ENERGY, 2017, 134 :968-983
[4]   An actor-critic algorithm for constrained Markov decision processes [J].
Borkar, VS .
SYSTEMS & CONTROL LETTERS, 2005, 54 (03) :207-213
[5]   Scalable multi-agent reinforcement learning for distributed control of residential energy flexibility [J].
Charbonnier, Flora ;
Morstyn, Thomas ;
McCulloch, Malcolm D. .
APPLIED ENERGY, 2022, 314
[6]   Optimal Power and Gas Flow With a Limited Number of Control Actions [J].
Chen, Sheng ;
Wei, Zhinong ;
Sun, Guoqiang ;
Sun, Yonghui ;
Zang, Haixiang ;
Zhu, Ying .
IEEE TRANSACTIONS ON SMART GRID, 2018, 9 (05) :5371-5380
[7]   Reinforcement Learning for Selective Key Applications in Power Systems: Recent Advances and Future Challenges [J].
Chen, Xin ;
Qu, Guannan ;
Tang, Yujie ;
Low, Steven ;
Li, Na .
IEEE TRANSACTIONS ON SMART GRID, 2022, 13 (04) :2935-2958
[8]   Configuration optimization and selection of a photovoltaic-gas integrated energy system considering renewable energy penetration in power grid [J].
Chen, Yuzhu ;
Xu, Jinzhao ;
Wang, Jun ;
Lund, Peter D. ;
Wang, Dengwen .
ENERGY CONVERSION AND MANAGEMENT, 2022, 254
[9]  
Chow Y., 2019, Lyapunov-based safe policy optimization for continuous control
[10]  
Dalal G, 2018, Arxiv, DOI arXiv:1801.08757