PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data

Citations: 22
Authors
Mejia, Amanda F. [1]
Nebel, Mary Beth [2]
Eloyan, Ani [3]
Caffo, Brian [4]
Lindquist, Martin A. [4]
Affiliations
[1] Indiana Univ, Dept Stat, Bloomington, IN USA
[2] Kennedy Krieger Inst, Ctr Neurodev & Imaging Res, Baltimore, MD USA
[3] Brown Univ, Dept Biostat, Providence, RI 02912 USA
[4] Johns Hopkins Univ, Dept Biostat, Baltimore, MD 21205 USA
Keywords
fMRI; High-dimensional statistics; Image analysis; Leverage; Outlier detection; Principal component analysis; Robust statistics; MOTION ARTIFACT; ROBUST;
DOI
10.1093/biostatistics/kxw050
Chinese Library Classification (CLC) number
Q [Biological sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Outlier detection for high-dimensional (HD) data is a popular topic in modern statistical research. However, one source of HD data that has received relatively little attention is functional magnetic resonance imaging (fMRI) data, which consist of hundreds of thousands of measurements sampled at hundreds of time points. At a time when the availability of fMRI data is rapidly growing, primarily through large, publicly available grassroots datasets, automated quality control and outlier detection methods are greatly needed. We propose principal components analysis (PCA) leverage and demonstrate how it can be used to identify outlying time points in an fMRI run. Furthermore, PCA leverage is a measure of the influence of each observation on the estimation of principal components, which are often of interest in fMRI data. We also propose an alternative measure, PCA robust distance, which is less sensitive to outliers and has controllable statistical properties. The proposed methods are validated through simulation studies and are shown to be highly accurate. We also conduct a reliability study using resting-state fMRI data from the Autism Brain Imaging Data Exchange and find that removal of outliers using the proposed methods results in more reliable estimation of subject-level resting-state networks using independent components analysis.
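The idea of PCA leverage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the data are arranged as a time-by-voxel matrix, takes leverage as the diagonal of the projection ("hat") matrix built from the leading left singular vectors, and flags time points whose leverage exceeds a multiple of the median. The function names, the number of components, the threshold factor of 3, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def pca_leverage(Y, n_components):
    """Leverage of each row (time point) of Y with respect to the
    leading principal components. Y has shape (T, V)."""
    # Center each column (voxel) across time.
    Yc = Y - Y.mean(axis=0, keepdims=True)
    # Thin SVD: Yc = U @ diag(s) @ Vt, with orthonormal columns in U.
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    Uq = U[:, :n_components]
    # Diagonal of Uq @ Uq.T, computed without forming the T x T matrix.
    return np.sum(Uq ** 2, axis=1)

def flag_outliers(leverage, factor=3.0):
    """Flag time points whose leverage exceeds factor * median.
    The factor is an illustrative assumption, not a recommended value."""
    return leverage > factor * np.median(leverage)

# Simulated example: Gaussian noise with one artificially shifted time point.
rng = np.random.default_rng(0)
T, V = 100, 500
Y = rng.standard_normal((T, V))
Y[40] += 10.0  # inject an artifact at time point 40

lev = pca_leverage(Y, n_components=5)
flags = flag_outliers(lev)
print("flagged time points:", np.flatnonzero(flags))
```

Because the retained singular vectors are orthonormal, the leverage values sum to the number of retained components, so the mean leverage is n_components / T; comparing each value with a typical level such as the median is one simple way to pick out unusually influential time points.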
Pages: 521-536
Number of pages: 16