Smart Train Operation Algorithms Based on Expert Knowledge and Reinforcement Learning

Cited by: 39
Authors
Zhou, Kaichen [1,2]
Song, Shiji [3,4]
Xue, Anke [5]
You, Keyou [3,4]
Wu, Hui [3,4]
Affiliations
[1] Tsinghua Univ, Beijing 100084, Peoples R China
[2] Univ Oxford, Dept Comp Sci, Oxford OX1 2JD, England
[3] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[4] Tsinghua Univ, BNRist, Beijing 100084, Peoples R China
[5] Hangzhou Dianzi Univ, Inst Informat & Control, Key Lab IOT & Informat Fus Technol Zhejiang, Hangzhou 310018, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2022 / Vol. 52 / No. 02
Funding
National Natural Science Foundation of China;
Keywords
Public transportation; Resistance; Inference algorithms; Rail transportation; Mathematical model; Force; Safety; Expert knowledge; reinforcement learning; smart train operation (STO); subway; HIGH-SPEED RAILWAY; OPTIMIZATION; STABILITY; SYSTEMS; SUBWAY; MODEL;
DOI
10.1109/TSMC.2020.3000073
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Over recent decades, automatic train operation (ATO) systems have been gradually adopted in many subway systems for their low cost and intelligence. This article proposes two smart train operation (STO) algorithms that integrate expert knowledge with reinforcement learning. Compared with previous work, the proposed algorithms realize continuous-action control of the subway system and optimize multiple critical objectives without using an offline speed profile. First, by learning from historical data of experienced subway drivers, we extract expert knowledge rules and build inference methods to guarantee the riding comfort, punctuality, and safety of the subway system. Then we develop two algorithms for optimizing the energy efficiency of train operation: one is the STO algorithm based on deep deterministic policy gradient (STOD), and the other is the STO algorithm based on normalized advantage function (STON). Finally, we verify the performance of the proposed algorithms via numerical simulations with real field data from the Yizhuang Line of the Beijing Subway and show that the developed STO algorithms outperform expert manual driving and existing ATO algorithms in terms of energy efficiency. Moreover, STOD and STON can adapt to different trip times and different resistance conditions.
Pages: 716-727
Number of pages: 12