Data fusion using factor analysis and low-rank matrix completion

Cited by: 0
Authors
Ahfock, Daniel [1]
Pyne, Saumyadipta [2, 3, 4]
McLachlan, Geoffrey J. [1]
Affiliations
[1] Univ Queensland, Sch Math & Phys, Brisbane, Qld, Australia
[2] Univ Pittsburgh, Grad Sch Publ Hlth, Publ Hlth Dynam Lab, Pittsburgh, PA USA
[3] Univ Pittsburgh, Grad Sch Publ Hlth, Dept Biostat, Pittsburgh, PA 15261 USA
[4] Hlth Analyt Network, Pittsburgh, PA USA
Funding
Australian Research Council
Keywords
Data fusion; Statistical file-matching; Low-rank matrix completion; Factor analysis; ALGORITHMS; NUMBER;
DOI
10.1007/s11222-021-10033-7
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Data fusion involves the integration of multiple related datasets. The statistical file-matching problem is a canonical data fusion problem in multivariate analysis, where the objective is to characterise the joint distribution of a set of variables when only strict subsets of marginal distributions have been observed. Estimation of the covariance matrix of the full set of variables is challenging given the missing-data pattern. Factor analysis models use lower-dimensional latent variables in the data-generating process, and this introduces low-rank components in the complete-data matrix and the population covariance matrix. The low-rank structure of the factor analysis model can be exploited to estimate the full covariance matrix from incomplete data via low-rank matrix completion. We prove the identifiability of the factor analysis model in the statistical file-matching problem under conditions on the number of factors and the number of shared variables over the observed marginal subsets. Additionally, we provide an EM algorithm for parameter estimation. On several real datasets, the factor model gives smaller reconstruction errors in file-matching problems than the common approaches for low-rank matrix completion.
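The low-rank idea in the abstract can be illustrated with a small population-level sketch. It is not the authors' EM algorithm or their identifiability proof; it only shows, for a hypothetical one-factor model with made-up loadings, how the never-observed cross-covariance block can be filled in from the observed blocks. The assumptions are: a single factor, so that Sigma = lam lam^T + diag(psi); three shared variables X with nonzero loadings; variables Y seen only in the first file and Z only in the second. The function names and all numbers are illustrative.

```python
# Sketch only: one-factor model at the population level, invented numbers.
# Sigma = lam lam^T + diag(psi); the (Y, Z) block is never observed jointly.

def factor_covariance(lam, psi):
    """Covariance implied by a one-factor model with loadings lam and uniquenesses psi."""
    p = len(lam)
    return [[lam[i] * lam[j] + (psi[i] if i == j else 0.0) for j in range(p)]
            for i in range(p)]

def complete_yz_block(sigma, x_idx, y_idx, z_idx):
    """Fill in Cov(Y, Z) using only entries from the observed (X, Y) and (X, Z) blocks.

    Assumes one factor and three shared variables with nonzero loadings.
    """
    a, b, c = x_idx[:3]
    # Off-diagonal entries of Cov(X, X) carry no uniqueness term, so
    # lam_a^2 = sigma_ab * sigma_ac / sigma_bc.
    lam_a_sq = sigma[a][b] * sigma[a][c] / sigma[b][c]
    # Cov(Y_j, Z_k) = lam_j * lam_k = sigma_aj * sigma_ak / lam_a^2.
    return [[sigma[a][j] * sigma[a][k] / lam_a_sq for k in z_idx] for j in y_idx]

# Hypothetical loadings and uniquenesses for 3 shared, 2 Y and 2 Z variables.
lam = [0.9, 0.7, 0.5, 0.8, 0.6, 0.4, 0.3]
psi = [0.19, 0.51, 0.75, 0.36, 0.64, 0.84, 0.91]
sigma = factor_covariance(lam, psi)

x_idx, y_idx, z_idx = [0, 1, 2], [3, 4], [5, 6]
recovered = complete_yz_block(sigma, x_idx, y_idx, z_idx)
truth = [[sigma[j][k] for k in z_idx] for j in y_idx]

max_abs_error = max(abs(recovered[r][c] - truth[r][c])
                    for r in range(len(y_idx)) for c in range(len(z_idx)))
print(max_abs_error)
```

The completion step reads only covariance entries that a file-matching design would observe. With sample covariances instead of population ones, or with more than one factor, a fitted model such as the EM approach mentioned in the abstract would be needed.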
Pages: 12