Continuous reinforcement learning of energy management with deep Q network for a power split hybrid electric bus

Cited by: 324

Authors:

Wu, Jingda

He, Hongwen

Peng, Jiankun [1]

Li, Yuecheng [1]

Li, Zhanjiang

Affiliations:

[1] Beijing Inst Technol, Sch Mech Engn, Natl Engn Lab Elect Vehicles, Beijing 100081, Peoples R China

Source:

APPLIED ENERGY | 2018 / Vol. 222

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation;

Keywords:

Energy management strategy; Continuous reinforcement learning; Deep Q learning; Dynamic programming; Hybrid electric bus; MODEL-PREDICTIVE CONTROL; STRATEGY; VEHICLES; ALGORITHM; DESIGN; HEV;

DOI:

10.1016/j.apenergy.2018.03.104

Chinese Library Classification (CLC) number:

TE [Petroleum and Natural Gas Industry]; TK [Energy and Power Engineering];

Discipline classification codes:

0807 ; 0820 ;

Abstract:

Reinforcement learning is an emerging research hotspot in the artificial intelligence community. Q learning, a well-known reinforcement learning algorithm, can achieve satisfactory control performance without requiring the complex internal factors of the controlled object to be modeled explicitly. However, it requires a discretized state space, which limits its application to energy management for hybrid electric buses (HEBs). In this paper, deep Q learning (DQL) is adopted for the energy management problem, and a DQL-based strategy is proposed and verified. First, the system model of the bus configuration is described. Then, the energy management strategy based on deep Q learning is put forward. A deep neural network is employed and trained to approximate the action-value function (Q function). Furthermore, a Q learning strategy based on the same model is applied for comparison with deep Q learning. Finally, part of the trained decision network is analyzed separately to verify the effectiveness and rationality of the DQL-based strategy. The training results indicate that the DQL-based strategy outperforms Q learning in training time and convergence rate. The results also show that the fuel economy of the proposed strategy under an unknown driving condition reaches 89% of that of the dynamic programming-based method. In addition, the strategy learns to reach the target state of charge under different initial conditions. The main contribution of this study is to introduce a novel reinforcement learning methodology into energy management for HEBs that overcomes the curse of state variable dimensionality; the technique can be adopted to solve similar problems.
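The core idea in the abstract, a neural network that approximates the Q function over a continuous state while the action set stays discrete, can be illustrated with a minimal sketch. This is not the authors' implementation: the state layout (state of charge, speed, acceleration), the five discrete power levels, the network size, the reward value, and all hyperparameters below are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 3   # assumed state: [SOC, speed, acceleration], all normalized
N_ACTIONS = 5   # assumed number of discrete engine-power levels
HIDDEN = 16     # assumed hidden-layer width
GAMMA = 0.9     # assumed discount factor
LR = 0.01       # assumed learning rate


def init_params():
    """Create weights for a one-hidden-layer Q network."""
    return {
        "W1": rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS)),
        "b2": np.zeros(N_ACTIONS),
    }


def q_values(params, state):
    """Return Q(state, a) for every action, plus the hidden activations."""
    hidden = np.maximum(0.0, state @ params["W1"] + params["b1"])  # ReLU
    return hidden @ params["W2"] + params["b2"], hidden


def select_action(params, state, epsilon):
    """Epsilon-greedy choice over the discrete action set."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    q, _ = q_values(params, state)
    return int(np.argmax(q))


def td_update(params, target_params, state, action, reward, next_state, done):
    """One gradient step on 0.5 * (Q(s, a) - target)^2.

    The target uses a separate, frozen copy of the network (target_params).
    Returns the TD error measured before the step.
    """
    q, hidden = q_values(params, state)
    next_q, _ = q_values(target_params, next_state)
    target = reward + (0.0 if done else GAMMA * float(np.max(next_q)))
    td_error = q[action] - target

    # Backpropagate through the two layers; only the taken action contributes.
    grad_q = np.zeros(N_ACTIONS)
    grad_q[action] = td_error
    grad_hidden = (params["W2"] @ grad_q) * (hidden > 0)
    grad_w2 = np.outer(hidden, grad_q)
    grad_w1 = np.outer(state, grad_hidden)

    params["W2"] -= LR * grad_w2
    params["b2"] -= LR * grad_q
    params["W1"] -= LR * grad_w1
    params["b1"] -= LR * grad_hidden
    return float(td_error)


if __name__ == "__main__":
    params = init_params()
    target_params = {k: v.copy() for k, v in params.items()}
    s = np.array([0.60, 0.30, 0.10])       # hypothetical current state
    s_next = np.array([0.59, 0.35, 0.00])  # hypothetical next state
    a = select_action(params, s, epsilon=0.1)
    # Hypothetical reward, e.g. negative fuel use for this step.
    err = td_update(params, target_params, s, a, -1.0, s_next, done=False)
    print("action:", a, "TD error before update:", err)
```

Because the network takes the raw state vector as input, no state discretization grid is needed, which is the limitation of tabular Q learning that the abstract points to. A full agent would also need an experience replay buffer and periodic synchronization of the target network, both omitted here for brevity.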

Citation

Pages: 799-811

Page count: 13

34 entries in total

[1] Particle Swarm Optimization of Coupled Electromechanical Systems [J].