Distributed Multiagent Reinforcement Learning With Action Networks for Dynamic Economic Dispatch

Cited by: 8
Authors
Hu, Chengfang [1 ]
Wen, Guanghui [2 ]
Wang, Shuai [3 ,4 ]
Fu, Junjie [2 ]
Yu, Wenwu [2 ]
Affiliations
[1] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211189, Peoples R China
[2] Southeast Univ, Sch Math, Dept Syst Sci, Nanjing 211189, Peoples R China
[3] Beihang Univ, Res Inst Frontier Sci, Beijing 100191, Peoples R China
[4] Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Power demand; Heuristic algorithms; Prediction algorithms; Couplings; Approximation algorithms; Power system stability; Convex functions; Distributed optimization; dynamic economic dispatch; multiagent reinforcement learning (MARL); smart grids; VISIBLE IMAGE FUSION; PERFORMANCE; INFORMATION; ALGORITHM; PROTEIN;
DOI
10.1109/TNNLS.2023.3234049
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
A new class of distributed multiagent reinforcement learning (MARL) algorithms, suitable for problems with coupling constraints, is proposed in this article to address the dynamic economic dispatch problem (DEDP) in smart grids. Specifically, the assumption, common in most existing results on the DEDP, that the cost functions are known and/or convex is removed. A distributed projection optimization algorithm is designed for the generation units to find feasible power outputs satisfying the coupling constraints. By using a quadratic function to approximate the state-action value function of each generation unit, an approximate optimal solution of the original DEDP can be obtained by solving a convex optimization problem. Each action network then uses a neural network (NN) to learn the relationship between the total power demand and the optimal power output of each generation unit, giving the algorithm the generalization ability to predict the optimal power output distribution for an unseen total power demand. Furthermore, an improved experience replay mechanism is introduced into the action networks to improve the stability of the training process. Finally, the effectiveness and robustness of the proposed MARL algorithm are verified by simulation.
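The paper's own implementation is not reproduced in this record. As a rough illustration of the dispatch step the abstract describes, the following is a minimal sketch under stated assumptions: each unit's state-action value is approximated by a convex quadratic in its power output (`a_i`, `b_i` are assumed learned coefficients, not values from the paper), and the coupling (power-balance) constraint with box limits then reduces the dispatch to a convex problem, solved here by a centralized bisection on the common incremental-cost multiplier rather than the paper's distributed projection algorithm. All names and numbers below are hypothetical.

```python
import numpy as np

def dispatch(a, b, p_min, p_max, demand, iters=200):
    """Minimize sum_i a_i p_i^2 + b_i p_i subject to
    sum_i p_i = demand and p_min_i <= p_i <= p_max_i.

    With a_i > 0 the problem is convex; at the optimum all units
    operating strictly inside their box share a common marginal
    cost lambda, so we bisect on lambda (classic lambda iteration).
    """
    def outputs(lam):
        # Unit i's unconstrained optimum solves 2 a_i p_i + b_i = lam,
        # then is clipped to its box limits.
        return np.clip((lam - b) / (2.0 * a), p_min, p_max)

    # Bracket lambda by the marginal costs at the box corners.
    lo = np.min(2.0 * a * p_min + b)
    hi = np.max(2.0 * a * p_max + b)
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        # Total output is nondecreasing in lambda.
        if outputs(lam).sum() < demand:
            lo = lam
        else:
            hi = lam
    return outputs(0.5 * (lo + hi))

# Hypothetical quadratic coefficients for three generation units.
a = np.array([0.10, 0.05, 0.08])
b = np.array([2.0, 3.0, 2.5])
p_min = np.zeros(3)
p_max = np.array([50.0, 80.0, 60.0])
p = dispatch(a, b, p_min, p_max, demand=120.0)
```

In the paper's setting, the quadratic coefficients would come from the learned state-action value approximation and the balancing step would be carried out distributively over the communication network; this sketch only shows why the quadratic approximation makes the resulting dispatch problem convex and cheaply solvable.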
Pages: 9553-9564
Number of pages: 12
Cited references
36 records
[11]   Distributed Economic Dispatch for Smart Grids With Random Wind Power [J].
Guo, Fanghong ;
Wen, Changyun ;
Mao, Jianfeng ;
Song, Yong-Duan .
IEEE TRANSACTIONS ON SMART GRID, 2016, 7 (03) :1572-1583
[12]   Distributed Power Management for Dynamic Economic Dispatch in the Multimicrogrids Environment [J].
He, Xing ;
Yu, Junzhi ;
Huang, Tingwen ;
Li, Chaojie .
IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2019, 27 (04) :1651-1658
[13]   Consensus plus Innovations Approach for Distributed Multiagent Coordination in a Microgrid [J].
Hug, Gabriela ;
Kar, Soummya ;
Wu, Chenye .
IEEE TRANSACTIONS ON SMART GRID, 2015, 6 (04) :1893-1903
[14]   Randomized Parallel Algorithms for Backtrack Search and Branch-and-Bound Computation [J].
Karp, R. M. ;
Zhang, Y. J. .
JOURNAL OF THE ACM, 1993, 40 (03) :765-789
[15]   Distributed Optimal Consensus Over Resource Allocation Network and Its Application to Dynamical Economic Dispatch [J].
Li, Chaojie ;
Yu, Xinghuo ;
Huang, Tingwen ;
He, Xing .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (06) :2407-2418
[16]   Virtual-Action-Based Coordinated Reinforcement Learning for Distributed Economic Dispatch [J].
Li, Dewen ;
Yu, Liying ;
Li, Ning ;
Lewis, Frank .
IEEE TRANSACTIONS ON POWER SYSTEMS, 2021, 36 (06) :5143-5152
[17]   Distributed Q-Learning-Based Online Optimization Algorithm for Unit Commitment and Dispatch in Smart Grid [J].
Li, Fangyuan ;
Qin, Jiahu ;
Zheng, Wei Xing .
IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (09) :4146-4156
[18]   Multiplayer Stackelberg-Nash Game for Nonlinear System via Value Iteration-Based Integral Reinforcement Learning [J].
Li, Man ;
Qin, Jiahu ;
Freris, Nikolaos M. ;
Ho, Daniel W. C. .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) :1429-1440
[19]   Distributed Economic Dispatch in Microgrids Based on Cooperative Reinforcement Learning [J].
Liu, Weirong ;
Zhuang, Peng ;
Liang, Hao ;
Peng, Jun ;
Huang, Zhiwu .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (06) :2192-2203
[20]   Consensus problems in networks of agents with switching topology and time-delays [J].
Olfati-Saber, R ;
Murray, RM .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2004, 49 (09) :1520-1533