Missing-data theory in the context of exploratory data analysis

Cited by: 24
Authors
Camacho, Jose [1]
Affiliations
[1] Univ Granada, Dept Teoria Senal Telemat & Comunicac, E-18071 Granada, Spain
Keywords
Exploratory data analysis; Missing data; Data understanding; Latent structures; Correlation matrix; Rotation; Variable selection; Principal components; Framework; PLS
DOI
10.1016/j.chemolab.2010.04.017
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes a new method for exploratory analysis and the interpretation of latent structures. The approach is named missing-data methods for exploratory data analysis (MEDA). MEDA can be applied in combination with several models, including Principal Components Analysis (PCA), Factor Analysis (FA) and Partial Least Squares (PLS). It can be seen as a substitute for rotation methods with better properties: it is more accurate than rotation methods in detecting relationships between pairs of variables, it is robust to overestimation of the number of principal components (PCs), and it does not depend on the normalization of the loadings. MEDA is useful for inferring the structure in the data and for interpreting the contribution of each latent variable. The interpretation of PLS models with MEDA, including variable selection, may be especially valuable for the chemometrics community. The use of MEDA with PCA and PLS models is demonstrated with several simulated and real examples. (C) 2010 Elsevier B.V. All rights reserved.
Pages: 8-18
Page count: 11
Related papers
24 in total
[1]  
Alaa E., 2005, LECT NOTES COMPUTER, P935
[2]   Applications of maximum likelihood principal component analysis: incomplete data sets and calibration transfer [J].
Andrews, DT ;
Wentzell, PD .
ANALYTICA CHIMICA ACTA, 1997, 350 (03) :341-352
[3]  
[Anonymous], 2002, Principal components analysis
[4]  
[Anonymous], J QUALITY TECHNOLOGY
[5]  
[Anonymous], 2003, User's Guide to Principal Components
[6]   Dealing with missing data in MSPC: several methods, different interpretations, some examples [J].
Arteaga, F ;
Ferrer, A .
JOURNAL OF CHEMOMETRICS, 2002, 16 (8-10) :408-418
[7]   Framework for regression-based missing data imputation methods in on-line MSPC [J].
Arteaga, F ;
Ferrer, A .
JOURNAL OF CHEMOMETRICS, 2005, 19 (08) :439-447
[8]  
CAMACHO J, CHEMOMETRICS I UNPUB
[9]   Data understanding with PCA: Structural and Variance Information plots [J].
Camacho, Jose ;
Pico, Jesus ;
Ferrer, Alberto .
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2010, 100 (01) :48-56
[10]  
Costello A. B., 2005, PRACTICAL ASSESSMENT, V10, P7, DOI 10.7275/jyj1-4868