Missing Values and Directional Outlier Detection in Model-Based Clustering

Cited by: 1
Authors
Tong, Hung [1]
Tortora, Cristina [2]
Affiliations
[1] Univ Alabama, Tuscaloosa, AL 35487 USA
[2] San Jose State Univ, San Jose, CA 95192 USA
Funding
National Science Foundation (USA)
Keywords
Model-based clustering; Outliers; Missing data; Contaminated normal distribution; Multiple scaled distributions; EM algorithm; MAXIMUM-LIKELIHOOD-ESTIMATION; MIXTURE-MODELS; PARSIMONIOUS MIXTURES; DISCRIMINANT-ANALYSIS; SIMULATING DATA; INCOMPLETE DATA; EM ALGORITHM; R PACKAGE; MULTIVARIATE; SELECTION;
DOI
10.1007/s00357-023-09450-2
Chinese Library Classification
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
Model-based clustering tackles the task of uncovering heterogeneity in a data set to extract valuable insights. Given the common presence of outliers in practice, robust methods for model-based clustering have been proposed. However, many of these methods are severely limited in applications where partially observed records are common, since their existing frameworks often assume complete data. Here, a mixture of multiple scaled contaminated normal (MSCN) distributions is extended using the expectation-conditional maximization (ECM) algorithm to accommodate data sets with values missing at random. The proposed extension preserves the mixture's ability to yield robust parameter estimates and to perform automatic outlier detection separately for each principal component. In this fitting framework, the MSCN marginal density is approximated using the inversion formula for the characteristic function. Extensive simulation studies involving incomplete data sets with outliers are conducted to evaluate parameter estimates and to compare the clustering performance and outlier detection of our model with those of other mixtures.
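Illustrative note: the abstract mentions approximating a density through the inversion formula for the characteristic function. The sketch below is not the authors' implementation. It assumes a univariate contaminated normal distribution as a simplified stand-in for the multivariate MSCN model, and the names `cn_pdf`, `cn_cf`, `pdf_by_inversion` and all parameter values are hypothetical. It recovers the density numerically from the characteristic function via f(x) = (1/pi) * integral from 0 to infinity of Re[exp(-itx) phi(t)] dt, and the result can be checked against the closed form.

```python
import numpy as np


def cn_pdf(x, mu, sigma, alpha, eta):
    """Closed-form density of a univariate contaminated normal.

    alpha is the proportion of "good" points; eta > 1 inflates the
    variance of the "bad" (outlying) component.
    """
    def norm_pdf(z, mean, var):
        return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

    good = norm_pdf(x, mu, sigma ** 2)
    bad = norm_pdf(x, mu, eta * sigma ** 2)
    return alpha * good + (1.0 - alpha) * bad


def cn_cf(t, mu, sigma, alpha, eta):
    """Characteristic function of the same contaminated normal."""
    good = np.exp(1j * mu * t - 0.5 * sigma ** 2 * t ** 2)
    bad = np.exp(1j * mu * t - 0.5 * eta * sigma ** 2 * t ** 2)
    return alpha * good + (1.0 - alpha) * bad


def pdf_by_inversion(x, cf, t_max=40.0, n=20001):
    """Approximate the density at scalar x by inverting cf numerically.

    Uses the trapezoidal rule on [0, t_max]; t_max and n are
    illustrative choices and must be large enough that cf has decayed
    to a negligible value at t_max.
    """
    t = np.linspace(0.0, t_max, n)
    integrand = np.real(np.exp(-1j * t * x) * cf(t))
    dt = t[1] - t[0]
    integral = dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return integral / np.pi


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    params = dict(mu=1.0, sigma=1.5, alpha=0.9, eta=10.0)
    for x in (-3.0, 0.0, 1.0, 4.5):
        exact = cn_pdf(x, **params)
        approx = pdf_by_inversion(x, lambda t: cn_cf(t, **params))
        print(x, exact, approx)
```

In this toy case the closed form exists, so the inversion serves only as a check. The approach is useful where, as the abstract indicates for the MSCN marginal density, the density itself has to be approximated.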
Pages: 480-513
Page count: 34
Related papers
50 in total
  • [11] A Model-Based Approach for Outlier Detection in Sensor Networks
    Ding, Min
    Liang, Qilian
    Cheng, Xiuzhen
    Al-Rodhaan, Mznah
    Al-Dhelaan, Abdullah
    Huang, Scott C. -H.
    Chen, Dechang
    AD HOC & SENSOR WIRELESS NETWORKS, 2011, 12 (3-4) : 275 - 293
  • [12] A Mixture Model-Based Combination Approach for Outlier Detection
    Bouguessa, Mohamed
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2014, 23 (04)
  • [13] Model-Based Outlier Detection System with Statistical Preprocessing
    Singh, Asir Antony Gnana
    Leavline, Jebalamar
    JOURNAL OF MODERN APPLIED STATISTICAL METHODS, 2016, 15 (01) : 789 - 801
  • [14] Cloud model-based outlier detection algorithm for categorical data
    Lei, Dajiang
    Zhang, Liping
    Zhang, Lisheng
    Lei, D. (leidj@cqupt.edu.cn), 1600, Science and Engineering Research Support Society, 20 Virginia Court, Sandy Bay, Tasmania, Australia (06): : 199 - 214
  • [15] Model-based Outlier Detection for Object-Relational Data
    Riahi, Fatemeh
    Schulte, Oliver
    2015 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2015, : 1590 - 1598
  • [16] Model-based outlier detection method for time series of process industry
    Su, Weixing
    Zhu, Yunlong
    Hu, Kunyuan
    Liu, Fang
    Yi Qi Yi Biao Xue Bao/Chinese Journal of Scientific Instrument, 2012, 33 (09): : 2080 - 2087
  • [17] Bayesian model-based outlier detection in network meta-analysis
    Metelli, Silvia
    Mavridis, Dimitris
    Crequit, Perrine
    Chaimani, Anna
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 2023, 186 (04) : 754 - 771
  • [18] Failure Analysis for Model-Based Organ Segmentation Using Outlier Detection
    Saalbach, Axel
    Stehle, Irina Waechter
    Lorenz, Cristian
    Weese, Juergen
    MEDICAL IMAGING 2014: IMAGE PROCESSING, 2014, 9034
  • [19] Subtractive Clustering Based RBF Neural Network Model for Outlier Detection
    Yang, Peng
    Zhu, Qingsheng
    Zhong, Xun
    JOURNAL OF COMPUTERS, 2009, 4 (08) : 755 - 762
  • [20] Model-Based Clustering
    Paul D. McNicholas
    Journal of Classification, 2016, 33 : 331 - 373