Single/Multi Agent Simplified Deep Reinforcement Learning Based Volt-Var Control of Power System

Cited by: 0
Authors
Ma Q. [1 ]
Deng C. [1 ]
Affiliations
[1] School of Electrical Engineering and Automation, Wuhan University, Wuhan
Source
Transactions of China Electrotechnical Society, 2024, Vol. 39, No. 05 (corresponding author: Deng, Changhong, dengch@whu.edu.cn)
Keywords
centralized control; decentralized control; multi-agent simplified deep reinforcement learning; single-agent simplified deep reinforcement learning; volt-var control
DOI
10.19595/j.cnki.1000-6753.tces.222195
Abstract
In order to quickly suppress the rapid fluctuations of reactive power and voltage caused by the random output changes of distributed energy resources, machine learning (ML) methods represented by deep reinforcement learning (DRL) and imitation learning (IL) have recently been applied to volt-var control (VVC) research, replacing traditional methods that require a large number of iterations. Although the ML methods in the existing literature can realize rapid online VVC optimization, shortcomings such as slow offline training and insufficient universality still hinder their practical application.

Firstly, this paper proposes a single-agent simplified DRL (SASDRL) method suitable for the centralized control of transmission networks. Building on the classic Actor-Critic architecture and the fact that whether the Actor network can generate good control strategies depends heavily on whether the Critic network can evaluate them accurately, this method simplifies and improves the offline training process of DRL-based VVC. Its core ideas are the simplification of Critic network training and a change in the update mode of the Actor and Critic networks. It reduces the sequential decision problem posed in traditional DRL-based VVC to a single-point decision problem, so that the output of the Critic network changes from the original sequential action value to the reward value corresponding to the current control strategy. In addition, by training the Critic network in advance to accelerate the convergence of the Actor network, it avoids the computational waste caused by the agent's random search in the early training stage, which greatly improves the offline training speed while retaining DRL's advantages: no need for massive labeled data, and strong universality.

Secondly, a multi-agent simplified DRL (MASDRL) method suitable for decentralized, zero-communication control of active distribution networks is proposed. This method generalizes the core idea of SASDRL to a multi-agent setting and again accelerates the convergence of each agent's Actor network by training a unified Critic network in advance. Each agent corresponds to a different VVC device in the system. During online application, each agent independently generates its control strategy through its own Actor network, using only the local information of the node to which its VVC device is connected. Besides, MASDRL adopts IL for initialization, injecting the global optimization idea into each agent in advance, which improves the local collaborative control effect between the various VVC devices.

Simulation results on the improved IEEE 118-bus system show that SASDRL and MASDRL both achieve the best VVC results among all the compared methods. In terms of offline training speed, SASDRL consumes the least training time: it is 4.47 times faster than traditional DRL and 50.76 times faster than IL. 87.1% of SASDRL's training time is spent generating the expert samples required for the supervised training of the Critic network, while only 12.9% is consumed by the training of the Actor and Critic networks. MASDRL realizes an 82.77% reduction in offline training time compared to traditional MADRL.
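To make the two-stage training idea above concrete, the following minimal PyTorch sketch shows a Critic fitted by supervised regression to predict the reward of a single (state, action) pair, and an Actor subsequently updated against that frozen Critic. Layer sizes, helper names (train_critic, train_actor), and data shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Minimal sketch of the two-stage SASDRL idea (hypothetical sizes and names).
# Stage 1: the Critic is fitted by supervised regression to map one
# (state, action) pair directly to its scalar VVC reward -- the single-point
# decision simplification.
# Stage 2: the Actor is trained against the frozen, pre-trained Critic,
# avoiding wasted computation from random exploration early in training.

STATE_DIM, ACTION_DIM = 236, 30   # illustrative sizes for a 118-bus case

critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
                       nn.Linear(256, 1))
actor = nn.Sequential(nn.Linear(STATE_DIM, 256), nn.ReLU(),
                      nn.Linear(256, ACTION_DIM), nn.Tanh())

def train_critic(samples, epochs=100, lr=1e-3):
    """samples: batches of (state, action, reward) tensors, e.g. produced by
    power-flow runs on expert control strategies."""
    opt = torch.optim.Adam(critic.parameters(), lr=lr)
    for _ in range(epochs):
        for s, a, r in samples:
            loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=-1)), r)
            opt.zero_grad(); loss.backward(); opt.step()

def train_actor(states, epochs=100, lr=1e-3):
    """With the Critic frozen, push the Actor toward high-reward actions."""
    for p in critic.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(actor.parameters(), lr=lr)
    for _ in range(epochs):
        for s in states:
            loss = -critic(torch.cat([s, actor(s)], dim=-1)).mean()
            opt.zero_grad(); loss.backward(); opt.step()
```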
The following conclusions can be drawn from the simulation analysis: (1) Compared with traditional mathematical methods and existing ML methods, SASDRL obtains excellent control results similar to those of the mathematical methods while greatly accelerating the offline training of DRL-based VVC. (2) Compared with traditional MADRL, by inheriting SASDRL's core ideas and introducing IL into the initialization of the Actor network, the proposed MASDRL+IL method significantly improves both the local collaborative control effect between the various VVC devices and the offline training speed. © 2024 China Machine Press. All rights reserved.
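As a companion sketch under the same caveats, the decentralized MASDRL deployment can be pictured as one small Actor per VVC device, behavior-cloned from expert samples for the IL initialization and then queried online with purely local node measurements (zero communication). All names and dimensions here are hypothetical, not the paper's code.

```python
import torch
import torch.nn as nn

# Companion sketch of decentralized MASDRL: one small Actor per VVC device,
# initialized by imitation learning and run online with local data only.

LOCAL_DIM = 4    # e.g. local voltage magnitude, P, Q, current device setpoint
N_AGENTS = 5     # one agent per VVC device in the network

agents = [nn.Sequential(nn.Linear(LOCAL_DIM, 64), nn.ReLU(),
                        nn.Linear(64, 1), nn.Tanh()) for _ in range(N_AGENTS)]

def imitation_init(agent, expert_data, epochs=50, lr=1e-3):
    """Behavior-clone one agent on (local_obs, expert_action) batches so the
    global optimization idea is injected before DRL fine-tuning."""
    opt = torch.optim.Adam(agent.parameters(), lr=lr)
    for _ in range(epochs):
        for obs, act in expert_data:
            loss = nn.functional.mse_loss(agent(obs), act)
            opt.zero_grad(); loss.backward(); opt.step()

def decentralized_control(local_obs):
    """Online stage: each agent acts only on its own node's measurements."""
    with torch.no_grad():
        return [agent(obs) for agent, obs in zip(agents, local_obs)]
```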
Pages: 1300-1312 (12 pages)