Mixtures of general location model with factor analyzer covariance structure for clustering mixed type data

Cited by: 1
Authors
Amiri, Leila [1 ]
Khazaei, Mojtaba [1 ]
Ganjali, Mojtaba [1 ]
Affiliations
[1] Shahid Beheshti Univ, Dept Stat, Tehran, Iran
Keywords
Mixture models; mixed type data; general location model; factor analysis; model-based clustering; the ECM algorithm; DISCRIMINANT-ANALYSIS; ELEMENT CONTENTS; CLASSIFICATION; VARIABLES; ALGORITHM; BINARY;
DOI
10.1080/02664763.2019.1579307
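Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline Classification Code
020208 ; 070103 ; 0714 ;
Abstract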
Cluster analysis is one of the most widely used methods in statistical analysis; it identifies homogeneous subgroups in a heterogeneous population. Because many applications involve mixed continuous and discrete data, ordinary clustering methods such as hierarchical methods, k-means and model-based methods have been extended to the analysis of mixed data. However, in the available model-based clustering methods, the number of parameters grows as the number of continuous variables increases, and identifying and fitting an appropriate model may become difficult. In this paper, to reduce the number of parameters, a set of parsimonious models is introduced for model-based clustering of mixed data comprising continuous (normal) and nominal variables. The models in this set are built by using the general location model approach to model the distribution of the mixed variables and by applying a factor analyzer structure to the covariance matrices. The ECM algorithm is used to estimate the parameters of these models. To show the clustering performance of the proposed models, results from several simulation studies and from the analysis of two real data sets are presented.
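The parameter saving that motivates the factor analyzer structure can be illustrated with a short count. The sketch below is not taken from the paper: it assumes the standard factor-analysis decomposition Sigma = Lambda Lambda^T + Psi with a p x q loading matrix Lambda and a diagonal Psi, and the function names and the choice q = 2 are hypothetical. It compares the free covariance parameters per component with those of an unconstrained covariance matrix.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in an unconstrained p x p covariance matrix."""
    return p * (p + 1) // 2


def factor_cov_params(p: int, q: int) -> int:
    """Free parameters in Sigma = Lambda Lambda^T + Psi (assumed structure).

    Lambda contributes p*q loadings, the diagonal Psi contributes p
    variances, and q*(q-1)/2 is subtracted because Lambda is only
    identified up to rotation.
    """
    return p * q + p - q * (q - 1) // 2


if __name__ == "__main__":
    q = 2  # hypothetical number of latent factors
    for p in (5, 10, 20, 50):
        print(f"p={p:>2}  full={full_cov_params(p):>4}  "
              f"factor(q={q})={factor_cov_params(p, q):>3}")
```

Under these assumptions the unconstrained count grows quadratically in p while the factor-structured count grows linearly for fixed q, which is the kind of reduction the abstract refers to.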
Pages: 2075 - 2100
Page count: 26
Related papers
50 in total
  • [21] A combined multilevel factor analysis and covariance regression model with mixed effects in the mean and variance structure
    Orindi, Benedict
    Quintero, Adrian
    Bruyneel, Luk
    Li, Baoyue
    Lesaffre, Emmanuel
    STATISTICS IN MEDICINE, 2023, 42 (18) : 3128 - 3144
  • [22] Spectral Clustering of Mixed-Type Data
    Mbuga, Felix
    Tortora, Cristina
    STATS, 2022, 5 (01) : 1 - 11
  • [23] Mixtures of generalized hyperbolic distributions and mixtures of skew-t distributions for model-based clustering with incomplete-data
    Wei, Yuhong
    Tang, Yang
    McNicholas, Paul D.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2019, 130 : 18 - 41
  • [24] Model-Based Clustering of Mixed Data With Sparse Dependence
    Choi, Young-Geun
    Ahn, Soohyun
    Kim, Jayoun
    IEEE ACCESS, 2023, 11 : 75945 - 75954
  • [25] Model-based clustering of Gaussian copulas for mixed data
    Marbac, Matthieu
    Biernacki, Christophe
    Vandewalle, Vincent
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2017, 46 (23) : 11635 - 11656
  • [26] Clustering Data of Mixed Categorical and Numerical Type With Unsupervised Feature Learning
    Lam, Dao
    Wei, Mingzhen
    Wunsch, Donald
    IEEE ACCESS, 2015, 3 : 1605 - 1613
  • [27] Clustering Mixed-Type Data with Correlation-Preserving Embedding
    Tran, Luan
    Fan, Liyue
    Shahabi, Cyrus
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT II, 2021, 12682 : 342 - 358
  • [28] Model based clustering for mixed data: clustMD
    McParland, Damien
    Gormley, Isobel Claire
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2016, 10 (02) : 155 - 169
  • [29] Model-based clustering of functional data via mixtures of t distributions
    Anton, Cristina
    Smith, Iain
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2024, 18 (03) : 563 - 595
  • [30] Dimensionally Reduced Model-Based Clustering Through Mixtures of Factor Mixture Analyzers
    Viroli, Cinzia
    JOURNAL OF CLASSIFICATION, 2010, 27 (03) : 363 - 388