Train timetabling with the general learning environment and multi-agent deep reinforcement learning

Cited by: 32

Authors:

Li, Wenqing [1]

Ni, Shaoquan [1,2,3]

Affiliations:

[1] Southwest Jiaotong Univ, Sch Transportat & Logist, Chengdu 610031, Peoples R China

[2] Southwest Jiaotong Univ, Natl Railway Train Timetable Res & Training Ctr, Chengdu 610031, Peoples R China

[3] Southwest Jiaotong Univ, Natl & Local Joint Engn Lab Comprehens Intelligent, Chengdu 610031, Peoples R China

Source:

TRANSPORTATION RESEARCH PART B-METHODOLOGICAL | 2022 / Vol. 157

Funding:

National Key Research and Development Program of China; National Natural Science Foundation of China

Keywords:

Train timetabling; Railway system; Multi-agent actor-critic algorithm; Deep reinforcement learning; SCHEDULING TRAINS; NEURAL-NETWORKS; LEVEL; GO; OPTIMIZATION; ALGORITHMS; GAME

DOI:

10.1016/j.trb.2022.02.006

Chinese Library Classification (CLC) number:

F [Economics]

Subject classification code:

02

Abstract:

This paper proposes a multi-agent deep reinforcement learning approach for the train timetabling problem of different railway systems. A general train timetabling learning environment is constructed to model the problem as a Markov decision process, in which the objectives and complex constraints of the problem can be distributed naturally and elegantly. Through subtle changes, the environment can be flexibly switched between the widely used double-track railway system and the more complex single-track railway system. To address the curse of dimensionality, a multi-agent actor-critic algorithm framework is proposed to decompose the large combinatorial decision space into multiple independent ones, which are parameterized by deep neural networks. The proposed approach was tested using a real-world instance and several test instances. Experimental results show that the proposed method obtains cooperative policies for the single-track train timetabling problem within a reasonable computing time and outperforms several prevailing methods in terms of the optimality of solutions. The proposed method can also be easily generalized to the double-track train timetabling problem by changing the environment slightly.
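The decomposition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-agent-per-train assignment, the three-action set, and the names `TrainAgent`, `joint_action_count`, and `factored_parameter_count` are all assumptions made for illustration, and a plain logit table stands in for the deep neural networks mentioned in the abstract. The sketch only shows why sampling from several independent per-agent policies avoids enumerating the joint action space.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TrainAgent:
    """Hypothetical agent with its own small policy (a logit table, not a deep net)."""

    def __init__(self, n_actions, rng):
        self.logits = [0.0] * n_actions  # uniform policy before any training
        self.rng = rng

    def act(self):
        # Sample one local action from this agent's own policy.
        probs = softmax(self.logits)
        return self.rng.choices(range(len(probs)), weights=probs)[0]


def joint_action_count(n_agents, n_actions):
    """Size of the joint action space a single centralized policy would cover."""
    return n_actions ** n_agents


def factored_parameter_count(n_agents, n_actions):
    """Number of policy outputs when each agent keeps an independent policy."""
    return n_agents * n_actions


rng = random.Random(0)
# Assumed setup: 10 agents, each choosing among 3 hypothetical local actions.
agents = [TrainAgent(n_actions=3, rng=rng) for _ in range(10)]
joint_action = [agent.act() for agent in agents]

print(joint_action_count(10, 3))        # 59049 joint actions
print(factored_parameter_count(10, 3))  # 30 per-agent outputs
```

With these assumed sizes, a centralized policy would have to score 59049 joint actions, whereas the factored form needs only 30 outputs in total. Coordination between agents, which the abstract attributes to the actor-critic framework, is not modeled here.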

Pages: 230-251

Number of pages: 22

37 references in total

[1] A State-of-the-Art Survey on Deep Learning Theory and Architectures [J].

Alom, Md Zahangir; Taha, Tarek M.; Yakopcic, Chris; Westberg, Stefan; Sidike, Paheding; Nasrin, Mst Shamima; Hasan, Mahmudul; Van Essen, Brian C.; Awwal, Abdul A. S.; Asari, Vijayan K.

ELECTRONICS, 2019, 8 (03)

[2] Railway timetabling using Lagrangian relaxation [J].

Brannlund, U; Lindberg, PO; Nou, A; Nilsson, JE.

TRANSPORTATION SCIENCE, 1998, 32 (04) :358-369

[3] A comprehensive survey of multiagent reinforcement learning [J].

Busoniu, Lucian; Babuska, Robert; De Schutter, Bart.

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2008, 38 (02) :156-172

[4] Approaches to a real-world Train Timetabling Problem in a railway node [J].

Cacchiani, Valentina; Furini, Fabio; Kidd, Martin Philip.

OMEGA-INTERNATIONAL JOURNAL OF MANAGEMENT SCIENCE, 2016, 58 :97-110

[5] Nominal and robust train timetabling problems [J].

Cacchiani, Valentina; Toth, Paolo.

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2012, 219 (03) :727-737

[6] Scheduling extra freight trains on railway networks [J].

Cacchiani, Valentina; Caprara, Alberto; Toth, Paolo.

TRANSPORTATION RESEARCH PART B-METHODOLOGICAL, 2010, 44 (02) :215-231

[7] A fast heuristic for the train scheduling problem [J].

Cai, X; Goh, CJ.

COMPUTERS & OPERATIONS RESEARCH, 1994, 21 (05) :499-510

[8] A Lagrangian heuristic algorithm for a real-world train timetabling problem [J].

Caprara, A; Monaci, M; Toth, P; Guida, PL.

DISCRETE APPLIED MATHEMATICS, 2006, 154 (05) :738-753

[9] Modeling and solving the train timetabling problem [J].

Caprara, A; Fischetti, M; Toth, P.

OPERATIONS RESEARCH, 2002, 50 (05) :851-861

[10] Scheduling trains on a network of busy complex stations [J].

Carey, Malachy; Crawford, Ivan.

TRANSPORTATION RESEARCH PART B-METHODOLOGICAL, 2007, 41 (02) :159-178