Missing Traffic Data Imputation with a Linear Generative Model Based on Probabilistic Principal Component Analysis

Cited by: 2
Authors
Huang, Liping [1]
Li, Zhenghuan [1]
Luo, Ruikang [1]
Su, Rong [1]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
Keywords
missing data; urban traffic sensing; probabilistic principal component analysis; prediction
DOI
10.3390/s23010204
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Despite the ubiquity of sensing data in intelligent transportation systems, such as mobile sensing of vehicle trajectories, traffic estimation still suffers from missing data caused by detector faults or by the limited number of probe vehicles acting as mobile sensors. Missing data are an obstacle to many downstream tasks, e.g., link-based traffic status modeling. Although many studies have tackled this problem, they mainly address the case in which data are missing at random and ignore differences between the links on which data are missing. In practice, traffic speed data are always missing not at random (MNAR), and how the recovery of missing data differs across links has not yet been studied. In this paper, we propose a general linear model based on probabilistic principal component analysis (PPCA) for imputing MNAR traffic speed data. Furthermore, we propose a metric, the Pearson score (p-score), for distinguishing links, and we investigate how the model performs on links with different p-score values. Experimental results show that the new model outperforms the commonly used PPCA model and that missing data on links with higher p-score values are recovered more accurately.
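For readers unfamiliar with PCA-style imputation, the following is a minimal sketch of the general idea only. It is not the model proposed in the paper: it is a simplified iterative low-rank (PCA) reconstruction without the probabilistic noise term of PPCA and without any treatment of the MNAR mechanism. The function name `iterative_pca_impute`, the layout (rows as time intervals, columns as links), and the default `rank` and `n_iter` values are assumptions made for illustration.

```python
import numpy as np


def iterative_pca_impute(X, rank=2, n_iter=50):
    """Fill NaN entries of X with a rank-`rank` PCA reconstruction.

    Assumed layout (hypothetical): rows are time intervals, columns are links.
    Every column is assumed to have at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initialise missing entries with the mean of the observed values per link.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    for _ in range(n_iter):
        # Centre the current estimate and take its truncated SVD.
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu

        # Overwrite only the missing entries; observed speeds stay untouched.
        filled[missing] = low_rank[missing]

    return filled
```

A typical call would pass a speed matrix with `np.nan` at the missing positions, e.g. `iterative_pca_impute(speeds, rank=2)`. A full PPCA treatment would additionally estimate a noise variance and use the posterior of the latent variables in an EM loop.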
Pages: 13
Related papers
50 in total
  • [41] Generalized probabilistic principal component analysis of correlated data
    Gu, Mengyang
    Shen, Weining
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [42] A Linear Model Based on Principal Component Analysis for Disease Prediction
    Roopa, H.
    Asha, T.
    IEEE ACCESS, 2019, 7 : 105314 - 105318
  • [43] A Combination of Multiple Imputation and Principal Component Analysis to Handle Missing Value with Arbitrary Pattern
    Anindita, Novita
    Nugroho, Hanung Adi
    Adji, Teguh Bharata
    2017 7TH INTERNATIONAL ANNUAL ENGINEERING SEMINAR (INAES), 2017 : 1 - 5
  • [44] Missing data imputation for traffic congestion data based on joint matrix factorization
    Jia, Xiaoyi
    Dong, Xiaoyu
    Chen, Meng
    Yu, Xiaohui
    KNOWLEDGE-BASED SYSTEMS, 2021, 225
  • [45] Evaluation Model of Region Traffic Safety Based on Principal Component Analysis
    Li, Qiangwei
    I2MTC: 2009 IEEE INSTRUMENTATION & MEASUREMENT TECHNOLOGY CONFERENCE, VOLS 1-3, 2009 : 230 - 233
  • [46] Urban Network Travel Time Prediction Based on a Probabilistic Principal Component Analysis Model of Probe Data
    Jenelius, Erik
    Koutsopoulos, Haris N.
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2018, 19 (02) : 436 - 445
  • [47] Missing Data Imputation in Transformer District Based on Improved Generative Adversarial Network
    Liu K.
    Zhou F.
    Zhou H.
    Wang C.
    Dianwang Jishu/Power System Technology, 2022, 46 (08) : 3231 - 3239
  • [48] Missing Values Imputation Using Genetic Algorithm for the Analysis of Traffic Data
    Midde, Ranjit Reddy
    Srinivasa, K. G.
    Reddy, Eswara B.
    ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, ICAIECES 2017, 2018, 668 : 251 - 261
  • [49] Differential privacy data publishing method based on the probabilistic principal component analysis
    Gu Z.
    Zhang G.
    Ma C.
    Song L.
    Harbin Gongcheng Daxue Xuebao/Journal of Harbin Engineering University, 2021, 42 (08) : 1217 - 1223
  • [50] A probabilistic generative model to discover the treatments of coexisting diseases with missing data
    Zaballa, Onintze
    Pérez, Aritz
    Gómez-Inhiesto, Elisa
    Acaiturri-Ayesta, Teresa
    Lozano, Jose A.
    Computer Methods and Programs in Biomedicine, 2024, 243