Structured Matrix Completion with Applications to Genomic Data Integration

Cited by: 51
Authors
Cai, Tianxi [1]
Cai, T. Tony [1]
Zhang, Anru [1]
Affiliations
[1] Univ Penn, Dept Stat, Wharton Sch, Philadelphia, PA 19104 USA
Funding
National Science Foundation (USA)
Keywords
Constrained minimization; Genomic data integration; Low-rank matrix; Matrix completion; Singular value decomposition; Structured matrix completion; LOW-RANK MATRIX; MISSING VALUE ESTIMATION; GENE-EXPRESSION DATA; OVARIAN-CANCER; GENOTYPE IMPUTATION; PENALIZATION; ALGORITHM; MODEL;
DOI
10.1080/01621459.2015.1021005
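Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract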
Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics, and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive a lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite samples under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extents of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival. Supplementary materials for this article are available online.
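To make the structured-missingness setting in the abstract concrete, the sketch below covers only the idealized exactly low-rank case. It is not the authors' SMC estimator, which targets approximately low-rank matrices. It relies on a standard linear-algebra identity: if the matrix is partitioned into blocks A11, A12, A21, A22, only the first rows and first columns are observed, and rank(A11) equals rank(A), then A22 = A21 A11^+ A12, where A11^+ is the Moore-Penrose pseudoinverse. The function name, block sizes, and tolerance are illustrative assumptions.

```python
import numpy as np


def complete_missing_block(a11, a12, a21, rcond=1e-10):
    """Fill in the unobserved block as A21 @ pinv(A11) @ A12.

    Exact only when rank(A11) == rank(A); an illustrative sketch,
    not the SMC procedure from the paper.
    """
    # rcond discards singular values of A11 that are numerically zero.
    return a21 @ np.linalg.pinv(a11, rcond) @ a12


# Hypothetical test problem: a 30 x 20 matrix of exact rank 3.
rng = np.random.default_rng(0)
p1, p2, r = 30, 20, 3
m1, m2 = 10, 8  # number of fully observed rows and columns (assumed)
A = rng.standard_normal((p1, r)) @ rng.standard_normal((r, p2))

# Observed blocks: the first m1 rows and the first m2 columns.
a11, a12 = A[:m1, :m2], A[:m1, m2:]
a21, a22 = A[m1:, :m2], A[m1:, m2:]  # a22 is held out as ground truth

a22_hat = complete_missing_block(a11, a12, a21)
rel_err = np.linalg.norm(a22_hat - a22) / np.linalg.norm(a22)
print(f"relative error on the missing block: {rel_err:.2e}")
```

With noise or only approximate low rank, the plain pseudoinverse can amplify small singular values of A11, which is one reason a method designed for that regime would need some form of rank selection or thresholding.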
Pages: 621-633
Page count: 13