A family of mixture models for biclustering

Cited by: 0
Authors
Tu, Wangshu [1]
Subedi, Sanjeena [1]
Affiliations
[1] Carleton Univ, Sch Math & Stat, 4302 Herzberg Labs,1125 Colonel By Dr, Ottawa, ON K1S 5B6, Canada
Keywords
AECM; biclustering; factor analysis; mixture models; model-based clustering; GENE-EXPRESSION DATA; FACTOR ANALYZERS; CLUSTER-ANALYSIS; CLASSIFICATION; LIKELIHOOD; ALGORITHM; DIMENSION
DOI
10.1002/sam.11555
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Biclustering simultaneously clusters the observations and the variables when no group structure is known a priori. It is increasingly used in bioinformatics, text analytics, and other fields. Biclustering has previously been introduced in a model-based clustering framework through a structure similar to a mixture of factor analyzers. In such models, the observed variables X are modeled using a latent variable U that is assumed to follow N(0, I). Clustering of the variables is introduced by constraining the entries of the factor loading matrix to be 0 or 1, which results in block-diagonal covariance matrices. However, this approach is overly restrictive: the off-diagonal elements within the blocks of the covariance matrices can only be 1, which can lead to unsatisfactory model fit on complex data. Here, the latent variable U is instead assumed to follow N(0, T), where T is a diagonal matrix. This ensures that the off-diagonal terms within the blocks of the covariance matrices are non-zero and not restricted to be 1, which leads to a superior model fit on complex data. A family of models is developed by imposing constraints on the components of the covariance matrix. Parameters are estimated using an alternating expectation conditional maximization (AECM) algorithm. Finally, the proposed method is illustrated on simulated and real datasets.
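The effect of replacing N(0, I) with N(0, T) can be checked numerically. The sketch below is illustrative only: it assumes the standard factor-analysis covariance form B T Bᵀ + D for a single mixture component, and the loading matrix B, the error covariance D, and the values in T are hypothetical, not taken from the paper.

```python
import numpy as np

# Hypothetical binary loading matrix: 5 variables, 2 latent factors.
# Each row holds a single 1, so each variable belongs to one variable group.
B = np.array([
    [1, 0],
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
], dtype=float)

# Hypothetical diagonal error covariance (assumed, for illustration).
D = np.diag([0.5, 0.4, 0.3, 0.2, 0.6])

# Earlier formulation: U ~ N(0, I), so the covariance is B B^T + D.
cov_identity = B @ B.T + D

# Formulation in the abstract: U ~ N(0, T) with T diagonal,
# so the covariance is B T B^T + D. The entries of T are made up.
T = np.diag([2.5, 0.7])
cov_general = B @ T @ B.T + D

print(cov_identity)
print(cov_general)
```

Under these assumptions, both matrices are block diagonal with the same block pattern. With U from N(0, I), every within-block off-diagonal entry equals 1; with U from N(0, T), the within-block off-diagonal entries equal the corresponding diagonal entry of T, so each variable group gets its own covariance level.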
Pages: 206-224
Page count: 19