Multivariate outlier detection in incomplete survey data: the epidemic algorithm and transformed rank correlations

Cited by: 8
Authors
Béguin, C
Hulliger, B [1]
Affiliations
[1] Swiss Fed Stat Off, Stat Methods Unit, CH-2010 Neuchatel, Switzerland
[2] Univ Neuchatel, CH-2000 Neuchatel, Switzerland
Keywords
data depth; missing value; multivariate data; outlier detection; robustness; sampling weight
DOI
10.1046/j.1467-985X.2003.00753.x
Chinese Library Classification number
O1 [Mathematics]; C [Social Sciences, General]
Discipline classification code
03; 0303; 0701; 070101
Abstract
As part of the EUREDIT project, new methods to detect multivariate outliers in incomplete survey data have been developed. These methods are the first to work with sampling weights and to be able to cope with missing values. Two of these methods are presented here. The epidemic algorithm simulates the propagation of a disease through a population and uses extreme infection times to find outlying observations. Transformed rank correlations are robust estimates of the centre and the scatter of the data. They use a geometric transformation that is based on the rank correlation matrix. The estimates are used to define a Mahalanobis distance that reveals outliers. The two methods are applied to a small data set and to one of the evaluation data sets of the EUREDIT project.
Pages: 275-294
Number of pages: 20
Related papers
50 in total
  • [1] The BACON-EEM algorithm for multivariate outlier detection in incomplete survey data
    Beguin, Cedric
    Hulliger, Beat
    SURVEY METHODOLOGY, 2008, 34 (01) : 91 - 103
  • [2] Application of Transformed Prediction Ellipsoids for Outlier Detection in Multivariate Non-Gaussian Data
    Prykhodko, Sergiy
    Makarova, Lidiia
    Prykhodko, Kateryna
    Pukhalevych, Andrii
    15TH INTERNATIONAL CONFERENCE ON ADVANCED TRENDS IN RADIOELECTRONICS, TELECOMMUNICATIONS AND COMPUTER ENGINEERING (TCSET - 2020), 2020, : 359 - 362
  • [3] Detection of multivariate outliers in business survey data with incomplete information
    Valentin Todorov
    Matthias Templ
    Peter Filzmoser
    Advances in Data Analysis and Classification, 2011, 5 : 37 - 56
  • [4] Detection of multivariate outliers in business survey data with incomplete information
    Todorov, Valentin
    Templ, Matthias
    Filzmoser, Peter
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2011, 5 (01) : 37 - 56
  • [5] Outlier detection for multivariate categorical data
    Puig, Xavier
    Ginebra, Josep
    QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2018, 34 (07) : 1400 - 1412
  • [6] Outlier detection in multivariate hydrologic data
    Kirk, Adam J.
    McCuen, Richard H.
    JOURNAL OF HYDROLOGIC ENGINEERING, 2008, 13 (07) : 641 - 646
  • [7] Outlier detection in multivariate analytical chemical data
    Egan, WJ
    Morgan, SL
    ANALYTICAL CHEMISTRY, 1998, 70 (11) : 2372 - 2379
  • [8] A Novel Outlier Detection Method for Multivariate Data
    Almardeny, Yahya
    Boujnah, Noureddine
    Cleary, Frances
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (09) : 4052 - 4062
  • [9] Multivariate Functional Data Visualization and Outlier Detection
    Dai, Wenlin
    Genton, Marc G.
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2018, 27 (04) : 923 - 934
  • [10] A Novel Approach for Outlier Detection in Multivariate Data
    Afzal, Saima
    Afzal, Ayesha
    Amin, Muhammad
    Saleem, Sehar
    Ali, Nouman
    Sajid, Muhammad
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2021, 2021