TSI-GNN: Extending Graph Neural Networks to Handle Missing Data in Temporal Settings

Cited by: 7
Authors
Gordon, David [1, 2]
Petousis, Panayiotis [3]
Zheng, Henry [2]
Zamanzadeh, Davina [2]
Bui, Alex A. T. [1, 2]
Affiliations
[1] Univ Calif Los Angeles, Dept Bioengn, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Radiol Sci, Med & Imaging Informat MII Grp, Los Angeles, CA 90095 USA
[3] UCLA, Clin & Translat Sci Inst, Los Angeles, CA USA
Source
FRONTIERS IN BIG DATA | 2021 / Vol. 4
Keywords
missing data; imputation; temporal data; irregular sampling; deep learning; graph neural networks; DATA IMPUTATION
DOI
10.3389/fdata.2021.693869
CLC number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
We present a novel approach for imputing missing data that incorporates temporal information into bipartite graphs through an extension of graph representation learning. Missing data is abundant in several domains, particularly when observations are made over time. Most imputation methods make strong assumptions about the distribution of the data. While novel methods may relax some assumptions, they may not consider temporality. Moreover, when such methods are extended to handle time, they may not generalize without retraining. We propose using a joint bipartite graph approach to incorporate temporal sequence information. Specifically, the observation nodes and edges with temporal information are used in message passing to learn node and edge embeddings and to inform the imputation task. Our proposed method, temporal setting imputation using graph neural networks (TSI-GNN), captures sequence information that can then be used within an aggregation function of a graph neural network. To the best of our knowledge, this is the first effort to use a joint bipartite graph approach that captures sequence information to handle missing data. We use several benchmark datasets to test the performance of our method under a variety of conditions, comparing to both classic and contemporary methods. We further provide insight into managing the size of the generated TSI-GNN model. Through our analysis, we show that incorporating temporal information into a bipartite graph improves the representation at the 30% and 60% missing rates, specifically when using a nonlinear model for downstream prediction tasks in regularly sampled datasets, and is competitive with existing temporal methods under different scenarios.
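
The bipartite-graph idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes (hypothetically) that the data arrive as a NumPy array of shape (samples, time steps, features) with NaN marking missing values, that each (sample, time step) row becomes an observation node, that each feature becomes a feature node, and that each observed value becomes an edge carrying its value and time index. The function names `build_bipartite_edges` and `mean_aggregate` are illustrative only, and a plain mean stands in for a learned aggregation function.

```python
import numpy as np

def build_bipartite_edges(x):
    """Turn a (samples, steps, features) array into a bipartite edge list.

    Assumption: NaN marks a missing entry. Each observed entry yields one
    edge between an observation node (row) and a feature node (column).
    """
    n_samples, n_steps, n_features = x.shape
    flat = x.reshape(n_samples * n_steps, n_features)
    obs_idx, feat_idx = np.nonzero(~np.isnan(flat))
    values = flat[obs_idx, feat_idx]
    # Rows are ordered sample-major, so the time step is the row index
    # modulo the number of steps.
    time_idx = obs_idx % n_steps
    return obs_idx, feat_idx, values, time_idx

def mean_aggregate(feat_idx, values, n_features):
    """One round of mean aggregation of edge values onto feature nodes."""
    sums = np.bincount(feat_idx, weights=values, minlength=n_features)
    counts = np.bincount(feat_idx, minlength=n_features)
    return sums / np.maximum(counts, 1)  # avoid division by zero

# Toy data: 2 samples, 2 time steps, 2 features, 3 missing entries.
x = np.array([[[1.0, np.nan], [2.0, 4.0]],
              [[np.nan, 6.0], [3.0, np.nan]]])
obs_idx, feat_idx, values, time_idx = build_bipartite_edges(x)
feature_means = mean_aggregate(feat_idx, values, x.shape[2])
```

In a graph neural network, the mean would typically be replaced by a learned message function over node and edge embeddings, with the time index supplied as an additional edge attribute.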
Pages: 9