Reinforcement learning method for the multi-objective speed trajectory optimization of a freight train

Cited by: 13
Authors
Lin, Xuan [1]
Liang, Zhicheng [2]
Shen, Lijuan [1]
Zhao, Fengyuan [3]
Liu, Xinyu [4]
Sun, Pengfei [5]
Cao, Taiqiang [1]
Affiliations
[1] Xihua Univ, Sch Elect Engn & Elect Informat, Chengdu 610039, Sichuan, Peoples R China
[2] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[3] Southwest Petr Univ, Sch Elect Engn & Informat, Chengdu 610500, Sichuan, Peoples R China
[4] China Railway Design Corp, Tianjin 300300, Peoples R China
[5] Southwest Jiaotong Univ, Sch Elect Engn, Chengdu 611756, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Reinforcement learning; Multi-objective optimization; ITO algorithm; ENERGY-EFFICIENT OPERATION; DRIVING STRATEGY; COAST CONTROL; RAILWAY; TIME; CONSUMPTION; PROFILES; MOVEMENT; DESIGN; SYSTEM;
DOI
10.1016/j.conengprac.2023.105605
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With growing pressure to mitigate the greenhouse effect, reducing the energy consumption of freight trains has attracted much attention. Solving the energy-efficient control problem requires taking multiple constraints into account, so it can be reformulated as a multi-objective optimization problem. This paper proposes a Reinforcement Learning (RL) method for multi-objective speed trajectory optimization that simultaneously targets energy efficiency, punctuality and accurate parking. Because the solution space of this optimization problem is large and discrete, a Gated Recurrent Unit (GRU)-based network is proposed to approximate the optimal value function quickly, replacing the lookup Q-table. A new architecture that includes an embedding matrix is used to model the control sequence that generates the speed trajectory. In addition, a Deep Q-Network (DQN) framework is constructed to train the GRU network without relying on prior knowledge of the freight train model. Finally, the Intelligent Train Operation (ITO) algorithm is proposed and verified using data from the Beijing-Guangzhou Railway Line and the HXD1B electric locomotive. The case studies indicate that the reward function of the ITO algorithm converges rapidly and that energy consumption decreases monotonically as trip time increases, which satisfies the multiple optimization objectives. In terms of energy savings, the ITO algorithm outperforms Fuzzy Predictive Control (FPC), the Genetic Algorithm (GA) and the field test data. The computation times for different speed trajectories demonstrate that the ITO algorithm is suitable for generating the optimal speed trajectory off-line.
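The idea in the abstract of replacing a lookup Q-table with a GRU that reads an embedded control sequence can be illustrated with a minimal sketch. This is not the authors' ITO implementation: the class name GRUQNetwork, the function td_target, the three-action set (traction, coast, brake), the layer sizes and the random untrained weights are all assumptions made for illustration, biases are omitted, and no gradient-based training step is shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUQNetwork:
    """Hypothetical Q-value approximator: embedding -> GRU -> linear head."""

    def __init__(self, n_actions, embed_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Embedding matrix: one learnable vector per discrete control action.
        self.embedding = rand_matrix(n_actions, embed_dim, rng)
        # Input (W) and recurrent (U) weights for the update gate z,
        # the reset gate r and the candidate state n. Biases are omitted.
        self.W = {g: rand_matrix(hidden_dim, embed_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}
        # Linear head mapping the final hidden state to one Q-value per action.
        self.head = rand_matrix(n_actions, hidden_dim, rng)

    def step(self, h, action):
        """One GRU update given the previous hidden state and an action index."""
        x = self.embedding[action]
        z = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b)
             for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def q_values(self, control_sequence):
        """Read the control sequence so far and return a Q-value per next action."""
        h = [0.0] * self.hidden_dim
        for action in control_sequence:
            h = self.step(h, action)
        return matvec(self.head, h)


def td_target(reward, next_q_values, gamma, done):
    """Standard DQN regression target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Assumed action indices: 0 = traction, 1 = coast, 2 = brake.
net = GRUQNetwork(n_actions=3, embed_dim=4, hidden_dim=5, seed=0)
q = net.q_values([0, 0, 1])
greedy_action = max(range(len(q)), key=lambda a: q[a])
```

In a full DQN loop, the network output for the chosen action would be regressed toward td_target using a reward that combines the energy, punctuality and parking objectives; how the paper weights those terms is not stated in the abstract, so it is left out here.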
Pages: 16