Data-Driven Coordinated Charging for Electric Vehicles With Continuous Charging Rates: A Deep Policy Gradient Approach

Cited by: 32
Authors
Jiang, Yuxuan [1 ]
Ye, Qiang [1 ]
Sun, Bo [2 ]
Wu, Yuan [3 ,4 ]
Tsang, Danny H. K. [2 ]
Affiliations
[1] Dalhousie Univ, Fac Comp Sci, Halifax, NS B3H 4R2, Canada
[2] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[3] Univ Macau, State Key Lab Internet Things Smart City, Macau, Peoples R China
[4] Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Electric vehicle charging; Internet of Things; Reinforcement learning; Analytical models; Training; Vehicle-to-grid; Numerical models; Coordinated charging; deep policy gradient; electric vehicle (EV)
DOI
10.1109/JIOT.2021.3135977
Chinese Library Classification
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
In this article, we consider a parking lot that manages the charging processes of its parked electric vehicles (EVs). Upon arrival, each EV requests a certain amount of energy, and this request must be fulfilled before the EV's departure. Coordinating the EVs' charging rates to smooth out the parking lot's load profile is critically important, because inappropriate charging rates can cause sharp spikes and fluctuations in the load profile, with negative effects on the power grid. Meanwhile, empirical studies show that many parking lots exhibit statistical patterns in EV dynamics; for example, the bulk of EVs arrive during rush hours. In this article, we therefore incorporate such patterns into charging rate coordination. Although these statistical patterns can be summarized from historical data, they are difficult to model analytically. As a result, we adopt a model-free deep reinforcement learning approach. We also take the latest continuous charging rate control technology into consideration; the decision variables are thus continuous, and a policy gradient algorithm is needed to perform reinforcement learning. Technically, we first formulate the problem as a Markov decision process (MDP) with unknown state transition probabilities. In deriving a deep policy gradient algorithm, the challenge lies in the inconsistent, state-dependent action space of the MDP model, which stems from the constraint that EVs' energy demands must be satisfied before their scheduled departures. To tackle this challenge, we design a customized model for neural network training by extending the action space to be consistent and state independent, and we revise the reward function to penalize the neural network output whenever it falls outside the action space of the original MDP model. With this customized model, we then develop a deep policy gradient algorithm based on the proximal policy gradient framework. Numerical results show that our algorithm outperforms the benchmarks.
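The abstract's core trick, extending a state-dependent action space and penalizing infeasible outputs through the reward, can be illustrated with a minimal sketch. The function names, the per-slot feasibility bounds, and the linear penalty form below are illustrative assumptions, not the paper's actual formulation:

```python
def feasible_range(remaining_energy, remaining_slots, r_max_phys, slot_hours=1.0):
    """Feasible charging-rate interval [r_min, r_max] for one EV in one time slot.

    r_min is the smallest rate that still allows the remaining energy demand
    to be met before departure, assuming all later slots charge at the
    physical maximum; r_max is the charger's physical limit.
    """
    deliverable_later = (remaining_slots - 1) * r_max_phys * slot_hours
    r_min = max(0.0, remaining_energy - deliverable_later) / slot_hours
    return r_min, r_max_phys

def penalized_reward(raw_rate, r_min, r_max, base_reward, penalty_coef=10.0):
    """Reward on the extended (unconstrained) action space.

    The policy network may output any rate; outputs outside [r_min, r_max]
    keep the base reward but subtract a penalty proportional to the violation,
    steering the learned policy back into the original MDP's action space.
    """
    violation = max(0.0, r_min - raw_rate) + max(0.0, raw_rate - r_max)
    return base_reward - penalty_coef * violation
```

For example, an EV needing 10 kWh with two one-hour slots left on a 6-kW charger must charge at no less than 4 kW now; a raw policy output of 7 kW exceeds the 6-kW bound and is penalized rather than masked out.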
Pages: 12395-12412
Number of pages: 18