The BACON-EEM algorithm for multivariate outlier detection in incomplete survey data

Authors
Beguin, Cedric [1]
Hulliger, Beat [2]
Affiliations
[1] Univ Neuchatel, CH-2010 Neuchatel, Switzerland
[2] Univ Appl Sci NW Switzerland, CH-4600 Olten, Switzerland
Keywords
forward search method; outlier detection; multivariate data; missing value; sampling; robustness; E-M algorithm
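DOI
Not available
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [Social Sciences, General]
Discipline classification codes
03; 0303; 0701; 070101
Abstract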
With complete multivariate data, the BACON algorithm (Billor, Hadi and Velleman 2000) yields a robust estimate of the covariance matrix. The corresponding Mahalanobis distance may be used for multivariate outlier detection. When items are missing, the EM algorithm is a convenient way to estimate the covariance matrix at each iteration step of the BACON algorithm. In finite population sampling, the EM algorithm must be enhanced to estimate the covariance matrix of the population rather than that of the sample. A version of the EM algorithm for survey data following a multivariate normal model, the EEM algorithm (Estimated Expectation Maximization), is proposed. The combination of the two algorithms, the BACON-EEM algorithm, is applied to two datasets and compared with alternative methods.
Pages: 91-103
Page count: 13
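
To make the complete-data idea in the abstract concrete, the following is a minimal sketch of a BACON-style forward search in Python. It is an illustration under stated assumptions, not the authors' BACON-EEM algorithm: it handles neither missing values nor sampling weights, it omits the finite-sample correction factor used in the original BACON paper, and it approximates the chi-square quantile with the Wilson-Hilferty formula. The names `chi2_quantile`, `mahalanobis_sq`, `bacon_sketch`, `alpha` and `init_factor` are hypothetical and chosen only for this example.

```python
import numpy as np
from statistics import NormalDist


def chi2_quantile(prob, df):
    """Approximate chi-square quantile (Wilson-Hilferty; an assumption of this sketch)."""
    z = NormalDist().inv_cdf(prob)
    return df * (1 - 2 / (9 * df) + z * np.sqrt(2 / (9 * df))) ** 3


def mahalanobis_sq(X, center, cov):
    """Squared Mahalanobis distance of every row of X from `center`."""
    diff = X - center
    return np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)


def bacon_sketch(X, alpha=0.01, init_factor=4, max_iter=50):
    """Simplified BACON-style forward search for complete, unweighted data.

    Returns a boolean outlier flag per row and the final squared distances.
    """
    n, p = X.shape
    # Robust start: the init_factor * p points closest to the coordinatewise median.
    start_dist = np.linalg.norm(X - np.median(X, axis=0), axis=1)
    subset = np.zeros(n, dtype=bool)
    subset[np.argsort(start_dist)[: init_factor * p]] = True
    # Cutoff on the squared distance; the original correction factor is omitted here.
    cutoff = chi2_quantile(1 - alpha / n, p)
    for _ in range(max_iter):
        # Mean and covariance are estimated from the current "good" subset only.
        center = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        d2 = mahalanobis_sq(X, center, cov)
        new_subset = d2 < cutoff
        if np.array_equal(new_subset, subset):
            break  # the subset is stable, so the search has converged
        subset = new_subset
    return ~subset, d2
```

In the setting of the abstract, the mean and covariance step inside the loop is where an EM-type estimate would replace `mean` and `np.cov` when items are missing; that replacement is not attempted in this sketch.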