Autonomous Input Voltage Sharing Control and Triple Phase Shift Modulation Method for ISOP-DAB Converter in DC Microgrid: A Multiagent Deep Reinforcement Learning-Based Method

Cited by: 35

Authors:

Zeng, Yu [1]

Pou, Josep [1]

Sun, Changjiang [2]

Mukherjee, Suvajit [3]

Xu, Xu [2,4]

Gupta, Amit Kumar [3]

Dong, Jiaxin [1]

Affiliations:

[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore

[2] Nanyang Technol Univ, Rolls Royce NTU Corp Lab, Singapore 639798, Singapore

[3] Rolls Royce Singapore Private Ltd, Singapore 638673, Singapore

[4] Xian Jiaotong Liverpool Univ, Sch Adv Technol, Dept Elect & Elect Engn, Suzhou 215123, Peoples R China

Source:

IEEE TRANSACTIONS ON POWER ELECTRONICS | 2023 / Vol. 38 / No. 03

Funding:

National Research Foundation, Singapore;

Keywords:

Microgrids; Voltage control; Stress; Uncertainty; Minimization; Inductors; Training; Input-series output-parallel-connected dual active bridge (ISOP-DAB) converter; input voltage sharing (IVS); multiagent twin-delayed deep deterministic policy gradient (MA-TD3); triple phase shift modulation; BIDIRECTIONAL DC/DC CONVERTER; REACTIVE POWER; CONTROL STRATEGY; OPTIMIZATION; TRANSFORMER;

DOI:

10.1109/TPEL.2022.3218900

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This article proposes a multiagent (MA) deep reinforcement learning (DRL) based autonomous input voltage sharing (IVS) control and triple phase shift modulation method for input-series output-parallel (ISOP) dual active bridge (DAB) converters to solve the three challenges: the uncertainties of the dc microgrid, the power balance problem, and the current stress minimization of the converter. Specifically, the control and modulation problem of the ISOP-DAB converter is formed as a Markov game with several DRL agents. Subsequently, the MA twin-delayed deep deterministic policy gradient (MA-TD3) algorithm is applied to train the DRL agents in an offline manner. After the training process, the multiple agents can provide online control decisions for the ISOP-DAB converter to balance the IVS, and minimize the current stress among different submodules. Without accurate model information, the proposed method can adaptively obtain the optimal modulation variable combinations in a stochastic and uncertain environment. Simulation and experimental results verify the effectiveness of the proposed MA-TD3-based algorithm.


Pages: 2985-3000

Page count: 16

1 item in total

[1] Reinforcement Learning Based Efficiency Optimization Scheme for the DAB DC-DC Converter With Triple-Phase-Shift Modulation
Tang, Yuanhong
Hu, Weihao
Xiao, Jian
Chen, Zhangyong
Huang, Qi
Chen, Zhe
Blaabjerg, Frede
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2021, 68 (08) : 7350 - 7361
