Mixtures of general location model with factor analyzer covariance structure for clustering mixed type data

Cited by: 1

Authors:

Amiri, Leila [1]

Khazaei, Mojtaba [1]

Ganjali, Mojtaba [1]

Affiliation:

[1] Shahid Beheshti Univ, Dept Stat, Tehran, Iran

Source:

JOURNAL OF APPLIED STATISTICS | 2019, Vol. 46, No. 11

Keywords:

Mixture models; mixed type data; general location model; factor analysis; model-based clustering; the ECM algorithm; DISCRIMINANT-ANALYSIS; ELEMENT CONTENTS; CLASSIFICATION; VARIABLES; ALGORITHM; BINARY;

DOI:

10.1080/02664763.2019.1579307

Chinese Library Classification (CLC) number:

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Discipline classification code:

020208 ; 070103 ; 0714 ;

Abstract:

Cluster analysis is one of the most widely used methods in statistical analysis; it identifies homogeneous subgroups within a heterogeneous population. Because many applications involve mixed continuous and discrete data, several standard clustering methods, such as hierarchical methods, k-means, and model-based methods, have been extended to the analysis of mixed data. However, in the available model-based clustering methods, the number of parameters grows as the number of continuous variables increases, and identifying and fitting an appropriate model may become difficult. In this paper, to reduce the number of parameters, a set of parsimonious models is introduced for model-based clustering of mixed data with continuous (normal) and nominal variables. The models in this set are built by using the general location model approach to model the distribution of the mixed variables and by applying a factor analyzer structure to the covariance matrices. The ECM algorithm is used to estimate the parameters of these models. To show the clustering performance of the proposed models, results from simulation studies and from the analysis of two real data sets are presented.
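
The parameter-saving argument in the abstract can be illustrated with a rough count. The sketch below assumes the standard factor analyzer decomposition Sigma = Lambda Lambda^T + Psi, with a p x q loading matrix Lambda and a diagonal Psi, and compares it with an unrestricted p x p covariance matrix. The function names are hypothetical, and the abstract does not state the exact parameterization of the paper's parsimonious family, so this covers only a single covariance matrix under that textbook assumption.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in an unrestricted symmetric p x p covariance matrix."""
    return p * (p + 1) // 2


def factor_analyzer_cov_params(p: int, q: int) -> int:
    """Free parameters in Sigma = Lambda Lambda^T + Psi (assumed structure).

    Lambda is p x q, Psi is diagonal (p entries); q(q-1)/2 parameters are
    removed to account for the rotational indeterminacy of Lambda.
    """
    if not 0 <= q <= p:
        raise ValueError("q must satisfy 0 <= q <= p")
    return p * q + p - q * (q - 1) // 2


if __name__ == "__main__":
    # Compare the two counts as the number of continuous variables p grows,
    # using q = 2 latent factors as an arbitrary illustrative choice.
    for p in (10, 20, 50):
        print(p, full_cov_params(p), factor_analyzer_cov_params(p, q=2))
```

With q = 2 and p = 50 continuous variables, the unrestricted count is 1275 per covariance matrix while the factor analyzer count is 149, which is the kind of reduction the abstract refers to.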


Pages: 2075-2100

Number of pages: 26
