Principal components in the discrimination of outliers: A study in simulation sample data corrected by Pearson's and Yates's chi-square distance

Cited by: 4
Authors
de Souza Veloso, Manoel Vitor [1 ]
Cirillo, Marcelo Angelo [2 ]
Affiliations
[1] Univ Fed Alfenas, Inst Ciencias Sociais Aplicadas, Campus Avancado Varginha, Varginha, MG, Brazil
[2] Univ Fed Lavras, Dept Ciencias Exatas, Cx Postal 3037, BR-37200000 Lavras, MG, Brazil
Keywords
contaminated samples; Monte Carlo; significance test; p-value; time series
DOI
10.4025/actascitechnol.v28i2.26046
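Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject classification codes
07; 0710; 09
Abstract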
The current study employs Monte Carlo simulation to build a significance test that indicates which principal components best discriminate outliers. Samples of different sizes were generated from multivariate normal distributions with different numbers of variables and correlation structures. Corrections by Pearson's and Yates's chi-square distances were applied for each sample size. Pearson's correlation test showed the best performance. As the number of variables increased, the significance probabilities in favor of the hypothesis H0 decreased. To illustrate the proposed method, it was applied to a multivariate time series of sales volume rates in the state of Minas Gerais, obtained in different market segments.
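As a rough illustration of the general idea, a Monte Carlo significance test on principal component scores might look like the sketch below. It is a minimal example under stated assumptions and not the authors' procedure: the equicorrelation structure, the max-absolute-score statistic, the helper names (`equicorrelation`, `pc_max_scores`, `monte_carlo_p_values`) and the contamination scheme are all hypothetical, and the Pearson and Yates chi-square corrections are not reproduced because the abstract does not specify them.

```python
import numpy as np

def equicorrelation(p, rho):
    """p x p correlation matrix with constant off-diagonal value rho (assumed structure)."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))

def pc_max_scores(x):
    """Largest absolute unit-variance score on each principal component."""
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # largest variance first
    scores = z @ eigvecs[:, order]
    scores = scores / np.sqrt(eigvals[order])     # rescale each component to unit variance
    return np.abs(scores).max(axis=0)

def monte_carlo_p_values(x, rho, n_sim=500, seed=0):
    """Per-component p-values under H0: the sample is uncontaminated multivariate normal."""
    rng = np.random.default_rng(seed)
    n, p = x.shape
    sigma = equicorrelation(p, rho)
    observed = pc_max_scores(x)
    null = np.array([
        pc_max_scores(rng.multivariate_normal(np.zeros(p), sigma, size=n))
        for _ in range(n_sim)
    ])
    # Add-one estimator keeps every p-value strictly positive.
    return (1 + (null >= observed).sum(axis=0)) / (n_sim + 1)

# Hypothetical contaminated sample: 50 observations, 4 variables, 3 shifted rows.
rng = np.random.default_rng(1)
sample = rng.multivariate_normal(np.zeros(4), equicorrelation(4, 0.5), size=50)
sample[:3] += 8.0
p_values = monte_carlo_p_values(sample, rho=0.5)
print(p_values)
```

In this sketch, components with small p-values are the ones on which the observed sample looks least like an uncontaminated one, which is one possible reading of "best discriminate outliers".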
Pages: 193-200
Number of pages: 8