Imputation of trip data for a docked bike-sharing system

Cited by: 1
Authors
Thomas, Milan Mathew [1]
Verma, Ashish [2]
Mayakuntla, Sai Kiran [3]
Affiliations
[1] Rajiv Gandhi Inst Technol, Dept Civil Engn, Kottayam 686501, Kerala, India
[2] Indian Inst Sci, Dept Civil Engn, Bengaluru 560012, India
[3] Univ Chile, Dept Civil Engn, Transport Div, Santiago, Chile
Source
CURRENT SCIENCE | 2022 / Vol. 122 / No. 03
Keywords
Bike-sharing system; imputation; incomplete records; origin and destination; probabilistic and machine learning approaches; trip data; MISSING DATA;
DOI
10.18520/cs/v122/i3/310-318
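Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes
07; 0710; 09
Abstract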
Mobile application-based transportation services are reshaping the urban transportation industries of both the developed and developing worlds. They generate massive amounts of data, which have the potential to provide deeper insights into urban travel activity than ever before. The bike-sharing service (BSS) market is growing at a breakneck pace, with new service providers entering the arena. However, several BSS start-ups in India have failed in recent years. All these cases have one aspect in common: user dissatisfaction caused by insufficient or ineffective rebalancing approaches. BSS operators rely on data insights to drive their policies and strategies. However, the data generated by these services are found to contain several incomplete records, such as trips with a missing origin or destination, as a result of various technical errors. As most BSS modelling focuses on trip origin and destination, completely ignoring (or listwise deleting) trips with missing information results in the loss of valuable data that are still present in the other observed variables, which include trip duration, date and time of the trip, and so on. This study proposes two methods for imputing missing data: (i) a probabilistic approach and (ii) a machine learning approach based on the k-nearest neighbor algorithm. The methodologies for their analyses are presented in detail. Data from a BSS that operated in the Indian Institute of Science campus, Bengaluru, India, are used to illustrate the proposed approaches. This is followed by a brief discussion of the results and a comparison of the performance.
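The idea of imputing a missing destination from a trip's other observed variables with a k-nearest neighbor rule can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `impute_destination`, the feature names (`duration_min`, `start_hour`), the station labels, the unscaled Euclidean distance, and the toy records are all assumptions made for the example and are not taken from the study's data.

```python
from collections import Counter
from math import hypot


def impute_destination(complete_trips, trip, k=3):
    """Guess a missing destination by majority vote among the k most
    similar complete trips (hypothetical helper, for illustration only)."""
    # Prefer neighbours that share the known origin; fall back to all trips.
    candidates = [t for t in complete_trips if t["origin"] == trip["origin"]]
    if not candidates:
        candidates = complete_trips

    def distance(t):
        # Euclidean distance over two illustrative features. In practice the
        # features would usually be standardised before comparing them.
        return hypot(t["duration_min"] - trip["duration_min"],
                     t["start_hour"] - trip["start_hour"])

    nearest = sorted(candidates, key=distance)[:k]
    votes = Counter(t["destination"] for t in nearest)
    return votes.most_common(1)[0][0]


# Toy records with hypothetical station labels "A", "B" and "C".
trips = [
    {"origin": "A", "destination": "B", "duration_min": 5.0, "start_hour": 9},
    {"origin": "A", "destination": "B", "duration_min": 6.0, "start_hour": 10},
    {"origin": "A", "destination": "C", "duration_min": 15.0, "start_hour": 9},
    {"origin": "A", "destination": "C", "duration_min": 14.0, "start_hour": 17},
    {"origin": "B", "destination": "A", "duration_min": 5.5, "start_hour": 9},
]

# A record whose destination was lost, but whose duration and time survive.
incomplete = {"origin": "A", "destination": None,
              "duration_min": 5.5, "start_hour": 9}
guess = impute_destination(trips, incomplete, k=3)
print(guess)
```

In this toy example the short trip starting at station "A" is assigned the destination held by most of its nearest complete trips, so the record is kept rather than listwise deleted.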
Pages: 310-318
Number of pages: 9