Mixtures of general location model with factor analyzer covariance structure for clustering mixed type data

Cited by: 1
Authors
Amiri, Leila [1 ]
Khazaei, Mojtaba [1 ]
Ganjali, Mojtaba [1 ]
Affiliations
[1] Shahid Beheshti Univ, Dept Stat, Tehran, Iran
Keywords
Mixture models; mixed type data; general location model; factor analysis; model-based clustering; the ECM algorithm; DISCRIMINANT-ANALYSIS; ELEMENT CONTENTS; CLASSIFICATION; VARIABLES; ALGORITHM; BINARY
DOI
10.1080/02664763.2019.1579307
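Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract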
Cluster analysis is one of the most widely used methods in statistical analysis; it identifies homogeneous subgroups within a heterogeneous population. Because many applications involve mixed continuous and discrete data, several standard clustering methods, such as hierarchical methods, k-means, and model-based methods, have been extended to handle mixed data. In the available model-based clustering methods, however, the number of parameters grows as the number of continuous variables increases, so identifying and fitting an appropriate model can be difficult. To reduce the number of parameters, this paper introduces a set of parsimonious models for model-based clustering of mixed data consisting of continuous (normal) and nominal variables. The models use the general location model approach to model the distribution of the mixed variables and impose a factor analyzer structure on the covariance matrices. The ECM algorithm is used to estimate the model parameters. The clustering performance of the proposed models is illustrated with simulation studies and analyses of two real data sets.
Pages: 2075-2100
Page count: 26
Related papers
50 in total
  • [41] Clustering of Mixed-Type Data Considering Concept Hierarchies
    Behzadi, Sahar
    Mueller, Nikola S.
    Plant, Claudia
    Boehm, Christian
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT I, 2019, 11439 : 555 - 573
  • [42] Robust clustering of multiply censored data via mixtures of t factor analyzers
    Wan-Lun Wang
    Tsung-I Lin
    TEST, 2022, 31 : 22 - 53
  • [43] A Bayesian Factor Model for Spatial Panel Data with a Separable Covariance Approach
    Leorato, Samantha
    Mezzetti, Maura
    BAYESIAN ANALYSIS, 2021, 16 (02): : 489 - 519
  • [44] Robust clustering of multiply censored data via mixtures of t factor analyzers
    Wang, Wan-Lun
    Lin, Tsung-I
    TEST, 2022, 31 (01) : 22 - 53
  • [45] Hierarchical clustering of mixed-type data based on barycentric coding
    Moschidis O.
    Markos A.
    Chadjipadelis T.
    Behaviormetrika, 2023, 50 (1) : 465 - 489
  • [46] Imputation Strategies for Clustering Mixed-Type Data with Missing Values
    Rabea Aschenbruck
    Gero Szepannek
    Adalbert F. X. Wilhelm
    Journal of Classification, 2023, 40 : 2 - 24
  • [47] Imputation Strategies for Clustering Mixed-Type Data with Missing Values
    Aschenbruck, Rabea
    Szepannek, Gero
    Wilhelm, Adalbert F. X.
    JOURNAL OF CLASSIFICATION, 2023, 40 (01) : 2 - 24
  • [48] Mixtures of Factor Analyzers with Common Factor Loadings: Applications to the Clustering and Visualization of High-Dimensional Data
    Baek, Jangsun
    McLachlan, Geoffrey J.
    Flack, Lloyd K.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, 32 (07) : 1298 - 1309
  • [49] MCF Tree-Based Clustering Method for Very Large Mixed-Type Data Set
    Ryu, Hyeong-Cheol
    Jung, Sungwon
    IEEE ACCESS, 2021, 9 : 138580 - 138597
  • [50] Clustering multivariate data using factor analytic Bayesian mixtures with an unknown number of components
    Papastamoulis, Panagiotis
    STATISTICS AND COMPUTING, 2020, 30 (03) : 485 - 506