Synthetic Generation of Trip Data: The Case of Smart Card

Cited by: 0
Authors
Minh Kieu
Iris Brighid Meredith
Andrea Raith
Affiliations
[1] Department of Civil and Environmental Engineering, University of Auckland, Auckland
[2] Department of Engineering Science, University of Auckland, Auckland
Source
Data Science for Transportation | 2023 / Vol. 5 / No. 2
Keywords
Bayesian Network; Generative Adversarial Network; Smart Card data; Synthetic data
DOI
10.1007/s42421-023-00079-6
Abstract
While individual data are key for epidemiology, social simulation, economics, and various other fields, data owners are increasingly required to protect the personally identifiable information in their data. Simple data de-identification or ‘data masking’ measures are limited, because they both reduce the utility of the dataset and are insufficient to protect individual confidentiality. This paper details the creation of a synthetic trip dataset in transportation, with Smart Card data as the case study. It discusses and compares two machine learning methods, a Generative Adversarial Network and a Bayesian Network, for modelling and generating this dataset. The synthetic data retain important utility of the real dataset, e.g., the origin, destination, and time of travel, while no synthetic data point represents a real trip in the original dataset. The synthetic dataset can be used in various applications, including microsimulation of public transport systems, analysing travel behaviours, model predictive control of transit flows, or evaluation of transport policies. © The Author(s) 2023.
Related papers
40 entries in total
[1]  
Ahmed G., Malick R.A.S., Akhunzada A., Zahid S., Sagri M.R., Gani A., An approach towards IoT-based predictive service for early detection of diseases in poultry chickens, Sustainability, 13, 23, (2021)
[2]  
Axhausen K.W., Garling T., Activity-based approaches to travel analysis: conceptual frameworks, models, and research problems, Transp Rev, 12, 4, pp. 323-341, (1992)
[3]  
Badu-Marfo G., Farooq B., Patterson Z., A differentially private multi-output deep generative networks approach for activity diary synthesis. arXiv preprint arXiv, 2012, (2020)
[4]  
Bengio Y., Thibodeau-Laufer É., Alain G., Yosinski J., Deep generative stochastic networks trainable by backprop, arXiv preprint arXiv:1306.1091 [cs], (2014)
[6]  
Bouman P.C., Kroon L.G., Schobel A., Vervest P.H.M., Passengers, crowding and complexity: models for passenger oriented public transport, (2017)
[7]  
Briot J.-P., Hadjeres G., Pachet F.-D., Deep learning techniques for music generation, Computational Synthesis and Creative Systems, (2020)
[8]  
Choi S., Kim J., Yeo H., TrajGAIL: generating urban vehicle trajectories using generative adversarial imitation learning. arXiv preprint arXiv:2007.14189 [cs, stat], (2021)
[9]  
Deeva I., Andriushchenko P.D., Kalyuzhnaya A.V., Boukhanovsky A.V., Bayesian networks-based personal data synthesis. In: Proceedings of the 6th EAI international conference on smart objects and technologies for social good, 6–11, (2020)
[10]  
Drechsler J., Reiter J.P., An empirical evaluation of easily implemented, nonparametric methods for generating synthetic datasets, Comput Stat Data Anal, 55, 12, pp. 3232-3243, (2011)