Scalable non-negative matrix tri-factorization

Cited by: 8
Authors
Copar, Andrej [1]
Zitnik, Marinka [1, 2]
Zupan, Blaz [1, 3]
Affiliations
[1] Univ Ljubljana, Fac Comp & Informat Sci, Ljubljana, Slovenia
[2] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[3] Baylor Coll Med, Houston, TX 77030 USA
Source
BIODATA MINING | 2017 / Volume 10
Keywords
Matrix factorization; Non-negative matrix tri-factorization; Non-negative block value decomposition; Block-wise multiplication; Graphics-processing unit; Large scale latent factor analysis; CANCER;
DOI
10.1186/s13040-017-0160-6
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: Matrix factorization is a well-established pattern discovery tool that has seen numerous applications in biomedical data analytics, such as gene expression co-clustering, patient stratification, and gene-disease association mining. Matrix factorization learns a latent data model that takes a data matrix and transforms it into a latent feature space, enabling generalization, noise removal and feature discovery. However, factorization algorithms are numerically intensive, and hence there is a pressing challenge to scale current algorithms to work with large datasets. Our focus in this paper is matrix tri-factorization, a popular method that is not limited by the assumption of standard matrix factorization about data residing in one latent space. Matrix tri-factorization solves this by inferring a separate latent space for each dimension in a data matrix, and a latent mapping of interactions between the inferred spaces, making the approach particularly suitable for biomedical data mining.
Results: We developed a block-wise approach for latent factor learning in matrix tri-factorization. The approach partitions a data matrix into disjoint submatrices that are treated independently and fed into a parallel factorization system. An appealing property of the proposed approach is its mathematical equivalence with serial matrix tri-factorization. In a study on large biomedical datasets we show that our approach scales well on multi-processor and multi-GPU architectures. On a four-GPU system we demonstrate that our approach can be more than 100 times faster than its single-processor counterpart.
Conclusions: A general approach for scaling non-negative matrix tri-factorization is proposed. The approach is especially useful for parallel matrix factorization implemented in a multi-GPU environment. We expect the new approach will be useful in emerging procedures for latent factor analysis, notably for data integration, where many large data matrices need to be collectively factorized.
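The block-wise idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes the widely used unconstrained multiplicative update for the row factor U in X ≈ U S Vᵀ, and the function names (`update_u_serial`, `update_u_blockwise`, `make_slices`) are hypothetical, chosen here for illustration. The point it shows is that the update only needs products that decompose over disjoint blocks of X, so a block-wise computation should match the serial one up to floating-point rounding.

```python
import numpy as np


def make_slices(n, n_blocks):
    """Split range(n) into n_blocks contiguous, disjoint slices."""
    edges = np.linspace(0, n, n_blocks + 1).astype(int)
    return [slice(a, b) for a, b in zip(edges[:-1], edges[1:])]


def update_u_serial(X, U, S, V, eps=1e-9):
    """One multiplicative update of U on the full matrix (assumed rule)."""
    numer = X @ V @ S.T
    denom = U @ (S @ (V.T @ V) @ S.T)
    return U * numer / (denom + eps)


def update_u_blockwise(X, U, S, V, row_blocks, col_blocks, eps=1e-9):
    """Same update, assembled from disjoint submatrices of X.

    Each X[r, c] block could live on a different worker or GPU; here the
    loop is serial and only demonstrates the algebra.
    """
    # V.T @ V is a sum of small Gram matrices, one per column block.
    VtV = sum(V[c].T @ V[c] for c in col_blocks)
    K = S @ VtV @ S.T
    U_new = np.empty_like(U)
    for r in row_blocks:
        # Rows r of X @ V are a sum of per-block partial products.
        XV = sum(X[r, c] @ V[c] for c in col_blocks)
        U_new[r] = U[r] * (XV @ S.T) / (U[r] @ K + eps)
    return U_new
```

Under these assumptions, calling both functions on the same non-negative inputs, with `row_blocks = make_slices(X.shape[0], 2)` and `col_blocks = make_slices(X.shape[1], 3)`, should give results that agree under `np.allclose`. The updates for V and S would follow the same pattern with the roles of rows and columns exchanged.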
Pages: 16
Related papers
46 in total
  • [1] Non-negative matrix factorization of multimodal MRI, fMRI and phenotypic data reveals differential changes in default mode subnetworks in ADHD
    Anderson, Ariana
    Douglas, Pamela K.
    Kerr, Wesley T.
    Haynes, Virginia S.
    Yuille, Alan L.
    Xie, Jianwen
    Wu, Ying Nian
    Brown, Jesse A.
    Cohen, Mark S.
    [J]. NEUROIMAGE, 2014, 102 : 207 - 219
  • [2] [Anonymous], PROC 11 EUR PVM MPI, DOI 10.1007/978-3-540-30218-6_19
  • [3] [Anonymous], 2012 IEEE 18 INT C P, DOI 10.1109/ICPADS.2012.97
  • [4] Benson AR, ADV NEURAL INFORM PR
  • [5] FAT4 functions as a tumour suppressor in gastric cancer by modulating Wnt/β-catenin signalling
    Cai, Jian
    Feng, Dan
    Hu, Liang
    Chen, Haiyang
    Yang, Guangzhen
    Cai, Qingping
    Gao, Chunfang
    Wei, Dong
    [J]. BRITISH JOURNAL OF CANCER, 2015, 113 (12) : 1720 - 1729
  • [6] Collaborative filtering using orthogonal nonnegative matrix tri-factorization
    Chen, Gang
    Wang, Fei
    Zhang, Changshui
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2009, 45 (03) : 368 - 379
  • [7] Copar A, CROW FAST NONNEGATIV
  • [8] Parallel distributed computing using Python
    Dalcin, Lisandro D.
    Paz, Rodrigo R.
    Kler, Pablo A.
    Cosimo, Alejandro
    [J]. ADVANCES IN WATER RESOURCES, 2011, 34 (09) : 1124 - 1139
  • [9] Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '04), P137
  • [10] Nonnegative Matrix Factorization: An Analytical and Interpretive Tool in Computational Biology
    Devarajan, Karthik
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2008, 4 (07)