Optimal Policy Characterization Enhanced Actor-Critic Approach for Electric Vehicle Charging Scheduling in a Power Distribution Network
Cited by: 71
Authors:
Jin, Jiangliang [1, 2]
Xu, Yunjian [1]
Affiliations:
[1] Chinese Univ Hong Kong, Dept Mech & Automat Engn, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Shun Hing Inst Adv Engn, Hong Kong, Peoples R China
Keywords:
Electric vehicle charging;
Optimal scheduling;
Stochastic processes;
Distribution networks;
Reinforcement learning;
Solar power generation;
Dynamic programming;
deep reinforcement learning;
electric vehicle charging;
actor-critic approach;
power distribution network;
DEMAND RESPONSE;
ENERGY-STORAGE;
REINFORCEMENT;
SYSTEMS;
DOI:
10.1109/TSG.2020.3028470
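Chinese Library Classification (CLC):
TM [Electrical Engineering];
TN [Electronics and Communication Technology];
Discipline classification codes:
0808;
0809;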
Abstract:
We study the scheduling of large-scale electric vehicle (EV) charging in a power distribution network under random renewable generation and electricity prices. The problem is formulated as a stochastic dynamic program with unknown state transition probability. To mitigate the curse of dimensionality, we establish the nodal multi-target (NMT) characterization of the optimal scheduling policy: all EVs with the same deadline at the same bus should be charged to approach a single target of remaining energy demand. We prove that the NMT characterization is optimal under arbitrarily random system dynamics. To adaptively learn the dynamics of system uncertainty, we propose a model-free soft-actor-critic (SAC) based method to determine the target levels for the characterized NMT policy. The proposed SAC + NMT approach significantly outperforms existing deep reinforcement learning methods (in our numerical experiments on the IEEE 37-node test feeder), as the established NMT characterization sharply reduces the dimensionality of neural network outputs without loss of optimality.
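The abstract does not give the exact charging rule, so the following is only a minimal sketch of one plausible reading of the NMT idea: EVs that share a bus and a deadline are charged toward a single target level of remaining energy demand. The function names, the per-EV rate cap `max_rate`, and the sample numbers are assumptions made for illustration and do not come from the paper.

```python
from typing import Dict, List, Tuple

# A group is identified by (bus, deadline); both indices are hypothetical here.
GroupKey = Tuple[int, int]


def nmt_charge(remaining: List[float], target: float, max_rate: float) -> List[float]:
    """Energy delivered in one time step to each EV of a single group.

    Each EV is charged toward the shared target of remaining demand,
    capped by an assumed per-EV rate limit. EVs already at or below the
    target receive nothing.
    """
    return [min(max_rate, max(0.0, r - target)) for r in remaining]


def schedule(
    groups: Dict[GroupKey, List[float]],
    targets: Dict[GroupKey, float],
    max_rate: float,
) -> Dict[GroupKey, List[float]]:
    """Apply the target rule to every (bus, deadline) group."""
    return {key: nmt_charge(rem, targets[key], max_rate) for key, rem in groups.items()}


# Remaining energy demand (kWh) per EV, grouped by (bus, deadline).
groups = {(3, 2): [10.0, 6.0, 2.0], (3, 5): [8.0, 7.0]}
# One target per group; in the paper these levels are chosen by the SAC agent.
targets = {(3, 2): 4.0, (3, 5): 7.5}

plan = schedule(groups, targets, max_rate=3.0)
print(plan)  # {(3, 2): [3.0, 2.0, 0.0], (3, 5): [0.5, 0.0]}
```

In this sketch the decision variable is one number per (bus, deadline) group instead of one per EV, which is the kind of reduction in output dimensionality the abstract attributes to the NMT characterization.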