A supervised machine learning model for imputing missing boarding stops in smart card data

Cited by: 1
Authors
Shalit, Nadav [1]
Fire, Michael [1]
Ben-Elia, Eran [2]
Affiliations
[1] Ben Gurion Univ Negev, Dept Software & Informat Syst Engn, Lab Data4Good, Beer Sheva, Israel
[2] Ben Gurion Univ Negev, Dept Geog & Environm Dev, GAMESLab, Beer Sheva, Israel
Keywords
Machine learning; Smart card; Boarding stop imputation; Public transport; Missing data; Pareto accuracy; PUBLIC TRANSPORT; DESTINATION ESTIMATION; BIG DATA; BEHAVIOR; AUTHENTICATION; PREDICTION; PATTERNS; SYSTEMS;
DOI
10.1007/s12469-022-00309-0
Chinese Library Classification (CLC) number
U [Transportation];
Discipline classification code
08; 0823;
Abstract
Public transport has become an essential part of urban existence with increased population densities and environmental awareness. Large quantities of data are currently generated, allowing for more robust methods to understand travel behavior by harvesting smart card usage. However, public transport datasets suffer from data integrity problems; boarding stop information may be missing due to imperfect acquirement processes or inadequate reporting. This study introduces a supervised machine learning method to impute missing boarding stops based on ordinal classification using GTFS timetable, smart card, and geospatial datasets. A new metric, Pareto Accuracy, is suggested to evaluate algorithms where classes have an ordinal nature. The results are based on a case study in the city of Beer Sheva, Israel, consisting of one month of smart card data. We show that our proposed method is robust to irregular travelers and significantly outperforms well-known imputation methods without the need to mine any additional datasets. The data validation from another Israeli city using transfer learning shows the presented model is general and context-free. The implications for transportation planning and travel behavior research are further discussed.
Pages: 287 - 319
Page count: 33
Related papers
94 items in total
  • [11] Bertsimas D, 2018, J MACH LEARN RES, V18
  • [12] Analyzing year-to-year changes in public transport passenger behaviour using smart card data
    Briand, Anne-Sarah
    Come, Etienne
    Trepanier, Martin
    Oukhellou, Latifa
    [J]. TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2017, 79 : 274 - 289
  • [13] Understanding behaviour through smartcard data analysis
    Bryan, H.
    Blythe, P.
    [J]. PROCEEDINGS OF THE INSTITUTION OF CIVIL ENGINEERS-TRANSPORT, 2007, 160 (04) : 173 - 177
  • [14] MEASURING THE PERFORMANCE OF ORDINAL CLASSIFICATION
    Cardoso, Jaime S.
    Sousa, Ricardo
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2011, 25 (08) : 1173 - 1195
  • [15] Real-Time Bus Arrival Information System: An Empirical Evaluation
    Cats, Oded
    Loutos, Gerasimos
    [J]. JOURNAL OF INTELLIGENT TRANSPORTATION SYSTEMS, 2016, 20 (02) : 138 - 151
  • [16] New urban public transportation systems: Initiatives, effectiveness, and challenges
    Ceder, A
    [J]. JOURNAL OF URBAN PLANNING AND DEVELOPMENT, 2004, 130 (01) : 56 - 65
  • [17] Ceder A, 2015, PUBLIC TRANSIT PLANNING AND OPERATION: MODELING, PRACTICE AND BEHAVIOR, 2ND EDITION, P1
  • [18] The promises of big data and small data for travel behavior (aka human mobility) analysis
    Chen, Cynthia
    Ma, Jingtao
    Susilo, Yusak
    Liu, Yu
    Wang, Menglin
    [J]. TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2016, 68 : 285 - 299
  • [19] XGBoost: A Scalable Tree Boosting System
    Chen, Tianqi
    Guestrin, Carlos
    [J]. KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 785 - 794
  • [20] Extracting bus transit boarding stop information using smart card transaction data
    Chen Z.
    Fan W.
    [J]. Journal of Modern Transportation, 2018, 26 (03) : 209 - 219