Joint latent space models for network data with high-dimensional node variables

Times cited: 9
Authors
Zhang, Xuefei [1]
Xu, Gongjun [1]
Zhu, Ji [1]
Affiliations
[1] Univ Michigan, Dept Stat, 1085 South Univ Ave, Ann Arbor, MI 48109 USA
Funding
U.S. National Science Foundation
Keywords
High-dimensional data; Latent space model; Network analysis; Community detection
DOI
10.1093/biomet/asab063
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Network latent space models assume that each node is associated with an unobserved latent position in a Euclidean space, and that these latent positions determine the probability that two nodes connect. In many applications, nodes are observed together with high-dimensional node variables, which provide important information for understanding the network structure. However, classical network latent space models have several limitations in incorporating node variables. In this paper, we propose a joint latent space model in which the latent variables not only explain the network structure but are also informative for the multivariate node variables. We develop a projected gradient descent algorithm that estimates the latent positions using a criterion incorporating both network structure and node variables. We establish theoretical properties of the estimators and provide insights into how incorporating high-dimensional node variables can improve the estimation accuracy of the latent positions. Simulation studies and an application to a Facebook data example demonstrate the improvement in latent variable estimation and in associated downstream tasks, such as missing value imputation for node variables.
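
The abstract does not give the exact model or criterion, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes an inner-product logistic model for edges, a linear Gaussian model for the node variables, a weight `lam` balancing the two losses, and centering of the latent positions as the projection step. The names `joint_loss`, `fit_joint`, `lam`, `step` and the simulated data are all hypothetical.

```python
import numpy as np

def joint_loss(Z, W, A, X, lam):
    """Assumed criterion: Bernoulli network loss + lam * squared error on node variables."""
    logits = Z @ Z.T
    off_diag = ~np.eye(A.shape[0], dtype=bool)
    # log(1 + exp(l)) - A * l, counted once per unordered node pair
    net = 0.5 * np.sum((np.logaddexp(0.0, logits) - A * logits)[off_diag])
    var = 0.5 * np.sum((X - Z @ W.T) ** 2)
    return net + lam * var

def fit_joint(A, X, k, lam=1.0, step=0.002, n_iter=300, seed=0):
    """Projected gradient descent on the assumed joint criterion."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Z = 0.1 * rng.standard_normal((n, k))
    Z -= Z.mean(axis=0)                      # start inside the constraint set
    W = 0.1 * rng.standard_normal((p, k))
    losses = [joint_loss(Z, W, A, X, lam)]
    for _ in range(n_iter):
        R = 1.0 / (1.0 + np.exp(-(Z @ Z.T))) - A   # edge residuals
        np.fill_diagonal(R, 0.0)                   # no self-loops
        E = Z @ W.T - X                            # node-variable residuals
        grad_Z = R @ Z + lam * E @ W
        grad_W = lam * E.T @ Z
        Z = Z - step * grad_Z
        W = W - step * grad_W
        Z -= Z.mean(axis=0)                  # projection: center the latent positions
        losses.append(joint_loss(Z, W, A, X, lam))
    return Z, W, losses

# Small simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n, k, p = 40, 2, 6
Z_true = rng.standard_normal((n, k))
Z_true -= Z_true.mean(axis=0)
P = 1.0 / (1.0 + np.exp(-(Z_true @ Z_true.T)))
upper = np.triu(rng.random((n, n)) < P, 1)
A = (upper | upper.T).astype(float)          # symmetric adjacency, zero diagonal
X = Z_true @ rng.standard_normal((p, k)).T + 0.5 * rng.standard_normal((n, p))

Z_hat, W_hat, losses = fit_joint(A, X, k)
```

In this sketch, setting `lam=0` reduces the criterion to a network-only fit, which is one way to see how the node variables enter the estimate of the latent positions.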
Pages: 707-720
Number of pages: 14