The Simpler The Better: A Unified Approach to Predicting Original Taxi Demands based on Large-Scale Online Platforms

Cited by: 233
Authors
Tong, Yongxin [1]
Chen, Yuqiang [2]
Zhou, Zimu [3]
Chen, Lei [4]
Wang, Jie [5]
Yang, Qiang [2, 4]
Ye, Jieping [5]
Lv, Weifeng [1]
Affiliations
[1] Beihang Univ, SKLSDE Lab, Beijing, Peoples R China
[2] 4Paradigm Inc, Beijing, Peoples R China
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[5] Didi Res Inst, Beijing, Peoples R China
Source
KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2017
Keywords
Unit Original Taxi Demands; Prediction; Feature Engineering
DOI
10.1145/3097983.3098018
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Taxi-calling apps are gaining increasing popularity for their efficiency in dispatching idle taxis to passengers in need. To precisely balance the supply and the demand of taxis, online taxicab platforms need to predict the Unit Original Taxi Demand (UOTD), which refers to the number of taxi-calling requirements submitted per unit time (e.g., every hour) and per unit region (e.g., each POI). Predicting UOTD is non-trivial for large-scale industrial online taxicab platforms because both accuracy and flexibility are essential. Complex non-linear models such as GBRT and deep learning are generally accurate, yet require labor-intensive model redesign after scenario changes (e.g., extra constraints due to new regulations). To accurately predict UOTD while remaining flexible to scenario changes, we propose LinUOTD, a unified linear regression model with more than 200 million dimensions of features. The simple model structure eliminates the need of repeated model redesign, while the high-dimensional features contribute to accurate UOTD prediction. We further design a series of optimization techniques for efficient model training and updating. Evaluations on two large-scale datasets from an industrial online taxicab platform verify that LinUOTD outperforms popular non-linear models in accuracy. We envision our experiences to adopt simple linear models with high-dimensional features in UOTD prediction as a pilot study and can shed insights upon other industrial large-scale spatio-temporal prediction problems.
Pages: 1653-1662
Page count: 10