Advanced Analytics for Train Delay Prediction Systems by Including Exogenous Weather Data

Cited by: 26
Authors
Oneto, Luca [1]
Fumeo, Emanuele [1]
Clerico, Giorgio [1]
Canepa, Renzo [2]
Papa, Federico [3]
Dambra, Carlo [3]
Mazzino, Nadia [3]
Anguita, Davide [1]
Affiliations
[1] Univ Genoa, DIBRIS, Via Opera Pia 13, I-16145 Genoa, Italy
[2] Rete Ferroviaria Italiana SpA, Via Don Vincenzo Minetti 6-5, I-16126 Genoa, Italy
[3] Ansaldo STS SpA, Via Paolo Mantovani 3-5, I-16151 Genoa, Italy
Source
PROCEEDINGS OF 3RD IEEE/ACM INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2016) | 2016
Keywords
EXTREME LEARNING-MACHINE; MODEL; CLASSIFICATION; NETWORKS; BOUNDS
DOI
10.1109/DSAA.2016.57
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
State-of-the-art train delay prediction systems exploit neither historical data about train movements nor exogenous data about phenomena that can affect railway operations. Instead, they rely on static rules built by railway infrastructure experts using classical univariate statistics. The purpose of this paper is to build a data-driven train delay prediction system that exploits the most recent analytics tools. The train delay prediction problem is mapped into a multivariate regression problem, and the performance of kernel methods, ensemble methods, and feed-forward neural networks is compared. First, it is shown that a reliable and robust data-driven model can be built using only historical data about train movements. Second, the model can be further improved by including data from exogenous sources, in particular the weather information provided by national weather services. Results on real-world data from the Italian railway network show that the proposed approach remarkably improves on current state-of-the-art train delay prediction systems. Moreover, the simulations performed show that including weather data in the model has a significant positive impact on its performance.
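The regression framing in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it swaps in plain least-squares linear regression for the kernel, ensemble, and neural network models the paper compares, and it runs on synthetic data. The feature names (`current_delay`, `rain`), the coefficients in `make_sample`, and all helper functions are hypothetical and chosen only for illustration.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_least_squares(rows, targets):
    """Fit weights w minimising ||Xw - y||^2 via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def mean_abs_error(weights, rows, targets):
    preds = [sum(w * v for w, v in zip(weights, r)) for r in rows]
    return sum(abs(p - y) for p, y in zip(preds, targets)) / len(targets)

def make_sample(rng):
    # Synthetic (invented) relationship: the delay at the next checkpoint
    # depends on the current delay and on rainfall, plus Gaussian noise.
    current_delay = rng.uniform(0.0, 15.0)  # minutes
    rain = rng.uniform(0.0, 10.0)           # mm/h
    next_delay = 1.0 + 0.8 * current_delay + 2.0 * rain + rng.gauss(0.0, 1.0)
    return current_delay, rain, next_delay

rng = random.Random(0)
train = [make_sample(rng) for _ in range(200)]
test = [make_sample(rng) for _ in range(100)]

def evaluate(use_weather):
    """Train on the training set and return the held-out mean absolute error."""
    def features(s):
        return [1.0, s[0]] + ([s[1]] if use_weather else [])
    w = fit_least_squares([features(s) for s in train], [s[2] for s in train])
    return mean_abs_error(w, [features(s) for s in test], [s[2] for s in test])

mae_movements_only = evaluate(use_weather=False)
mae_with_weather = evaluate(use_weather=True)
print(f"MAE, movements only: {mae_movements_only:.2f} min")
print(f"MAE, with weather:   {mae_with_weather:.2f} min")
```

Because the synthetic target is constructed to depend on rainfall, the model that sees the weather feature has the lower held-out error here. That outcome reflects how the toy data was generated and says nothing about the magnitude of the effect reported in the paper.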
Pages: 458-467
Page count: 10