Identifying outliers using multiple kernel canonical correlation analysis with application to imaging genetics

Cited by: 7
Authors
Alam, Md. Ashad [1, 3]
Calhoun, Vince D. [2]
Wang, Yu-Ping [1 ]
Affiliations
[1] Tulane Univ, Dept Biomed Engn, New Orleans, LA 70118 USA
[2] Univ New Mexico, Dept Elect & Comp Engn, Albuquerque, NM 87131 USA
[3] Hajee Mohammad Danesh Sci & Technol Univ, Dept Stat, Dinajpur 5200, Bangladesh
Keywords
Multiple kernel CCA; Influence function; Outlier detection; Multimodal datasets; Imaging genetics; Robustness; Consistency
DOI
10.1016/j.csda.2018.03.013
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Identifying significant outliers or atypical objects in multimodal datasets is an essential and challenging problem in biomedical research. This problem is addressed using the influence function of multiple kernel canonical correlation analysis. First, the influence functions (IFs) of the kernel mean element, the kernel covariance operator, the kernel cross-covariance operator, and kernel canonical correlation analysis (kernel CCA) are studied. Second, an IF of multiple kernel CCA is proposed, which can be applied to multimodal datasets. Third, a visualization method is proposed to detect influential observations across multiple data sources, based on the IFs of kernel CCA and multiple kernel CCA. Finally, to validate the method, experiments are performed on both synthesized and imaging genetics data (e.g., SNP, fMRI, and DNA methylation). To examine the outliers, both the stem-and-leaf display and a distribution-based technique are used. The performance of the proposed approach is illustrated on 116 candidate regions of interest (ROIs) from the fMRI data of a schizophrenia study, with the aim of identifying significant ROIs. The proposed method and two state-of-the-art statistical methods identified 8, 34, and 10 ROIs, respectively. Based on an online database, the brain mappings of the 7 ROIs selected in common indicate irregular brain regions susceptible to schizophrenia. The results demonstrate that the proposed method is capable of analyzing outliers and the influence of observations, and that it is applicable to many other biomedical datasets, which are often high-dimensional and multimodal. (C) 2018 Elsevier B.V. All rights reserved.
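As a rough illustration of the simplest quantity the abstract names, the IF of the kernel mean element at a point z is k(., z) minus the mean element, and its squared RKHS norm can be computed from the Gram matrix alone. The sketch below is not the authors' multiple kernel CCA procedure; the Gaussian kernel, the bandwidth, the toy data, and the function names are all assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def kernel_mean_influence(points, bandwidth=1.0):
    """Squared RKHS norm of k(., x_i) minus the empirical mean element, for every x_i.

    Expanding the norm gives k(x_i, x_i) - 2 * mean_j k(x_i, x_j) + mean_{j,l} k(x_j, x_l).
    """
    n = len(points)
    gram = [[gaussian_kernel(p, q, bandwidth) for q in points] for p in points]
    row_means = [sum(row) / n for row in gram]
    grand_mean = sum(row_means) / n
    return [gram[i][i] - 2.0 * row_means[i] + grand_mean for i in range(n)]

# Hypothetical toy data: five clustered points and one distant point (index 5).
data = [[0.0, 0.1], [0.1, 0.0], [-0.1, 0.0], [0.0, -0.1], [0.05, 0.05], [4.0, 4.0]]
scores = kernel_mean_influence(data)

# The observation with the largest score is the most influential on the mean element.
most_influential = max(range(len(scores)), key=scores.__getitem__)
print(most_influential, [round(s, 3) for s in scores])
```

Observations with unusually large scores are candidates for closer inspection; the paper's own approach extends this idea to the IFs of kernel CCA and multiple kernel CCA.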
Pages: 70-85
Number of pages: 16