Nonlinear principal component analysis of noisy data

Cited by: 34
Authors
Hsieh, William W. [1]
Affiliations
[1] Univ British Columbia, Dept Earth & Ocean Sci, Vancouver, BC V6T 1Z4, Canada
Keywords
nonlinear principal component analysis; information criterion; model selection; autoassociative neural network; regularization; El Nino; ENSO; quasibiennial oscillation
DOI
10.1016/j.neunet.2007.04.018
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With very noisy data, having plentiful samples eliminates overfitting in nonlinear regression, but not in nonlinear principal component analysis (NLPCA). To overcome this problem in NLPCA, a new information criterion (IC) is proposed for selecting the best model among multiple models with different complexity and regularization (i.e. weight penalty). This IC gauges the inconsistency I between the nonlinear principal components (u and ũ) for every data point x and its nearest neighbour x̃, with I = 1 - correlation(u, ũ), where I tends to increase with overfitted solutions. Tests were performed using autoassociative neural networks for NLPCA on synthetic and real climate data (tropical Pacific sea surface temperatures and equatorial stratospheric winds), with the IC performing well in model selection and in deciding between an open-curve and a closed-curve solution. © 2007 Elsevier Ltd. All rights reserved.
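The inconsistency index described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `inconsistency_index`, the use of Euclidean distance to find the nearest neighbour, and the Pearson correlation from NumPy are assumptions made here for illustration, and the toy data (points on a parabola) are invented for the demo.

```python
import numpy as np

def inconsistency_index(x, u):
    """Return I = 1 - corr(u, u_nn) (hypothetical helper, not from the paper).

    x : (n_samples, n_features) data points
    u : (n_samples,) nonlinear principal component assigned to each point
    u_nn[i] is the component of the nearest neighbour of x[i]
    (Euclidean distance is an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    # Pairwise squared Euclidean distances between all data points.
    diff = x[:, None, :] - x[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    # A point must not be chosen as its own nearest neighbour.
    np.fill_diagonal(d2, np.inf)
    nn = np.argmin(d2, axis=1)
    # Pearson correlation between each point's u and its neighbour's u.
    c = np.corrcoef(u, u[nn])[0, 1]
    return 1.0 - c

# Toy data: points along a parabola, parameterised by t.
t = np.linspace(-1.0, 1.0, 200)
x = np.column_stack([t, t ** 2])

# A smooth component assigns similar u to neighbouring points.
i_smooth = inconsistency_index(x, t)

# An erratic component (stand-in for an overfitted solution) does not.
rng = np.random.default_rng(0)
i_erratic = inconsistency_index(x, rng.standard_normal(t.size))
```

In this toy setting the smooth component gives an index close to 0, while the erratic one gives a much larger value, matching the abstract's statement that I tends to increase with overfitted solutions. The brute-force distance matrix is only suitable for small samples.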
Pages: 434-443
Page count: 10