Autonomous Input Voltage Sharing Control and Triple Phase Shift Modulation Method for ISOP-DAB Converter in DC Microgrid: A Multiagent Deep Reinforcement Learning-Based Method

Cited by: 35
Authors
Zeng, Yu [1]
Pou, Josep [1]
Sun, Changjiang [2]
Mukherjee, Suvajit [3]
Xu, Xu [2, 4]
Gupta, Amit Kumar [3]
Dong, Jiaxin [1]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Nanyang Technol Univ, Rolls Royce NTU Corp Lab, Singapore 639798, Singapore
[3] Rolls Royce Singapore Private Ltd, Singapore 638673, Singapore
[4] Xian Jiaotong Liverpool Univ, Sch Adv Technol, Dept Elect & Elect Engn, Suzhou 215123, Peoples R China
Funding
National Research Foundation, Singapore
Keywords
Microgrids; Voltage control; Stress; Uncertainty; Minimization; Inductors; Training; Input-series output-parallel-connected dual active bridge (ISOP-DAB) converter; input voltage sharing (IVS); multiagent twin-delayed deep deterministic policy gradient (MA-TD3); triple phase shift modulation; BIDIRECTIONAL DC/DC CONVERTER; REACTIVE POWER; CONTROL STRATEGY; OPTIMIZATION; TRANSFORMER;
DOI
10.1109/TPEL.2022.3218900
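Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract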
This article proposes a multiagent (MA) deep reinforcement learning (DRL) based autonomous input voltage sharing (IVS) control and triple phase shift modulation method for input-series output-parallel (ISOP) dual active bridge (DAB) converters to solve the three challenges: the uncertainties of the dc microgrid, the power balance problem, and the current stress minimization of the converter. Specifically, the control and modulation problem of the ISOP-DAB converter is formed as a Markov game with several DRL agents. Subsequently, the MA twin-delayed deep deterministic policy gradient (MA-TD3) algorithm is applied to train the DRL agents in an offline manner. After the training process, the multiple agents can provide online control decisions for the ISOP-DAB converter to balance the IVS, and minimize the current stress among different submodules. Without accurate model information, the proposed method can adaptively obtain the optimal modulation variable combinations in a stochastic and uncertain environment. Simulation and experimental results verify the effectiveness of the proposed MA-TD3-based algorithm.
Pages: 2985-3000
Page count: 16
Related papers
1 in total
  • [1] Reinforcement Learning Based Efficiency Optimization Scheme for the DAB DC-DC Converter With Triple-Phase-Shift Modulation
    Tang, Yuanhong
    Hu, Weihao
    Xiao, Jian
    Chen, Zhangyong
    Huang, Qi
    Chen, Zhe
    Blaabjerg, Frede
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2021, 68 (08) : 7350 - 7361