Multi-source feature learning for joint analysis of incomplete multiple heterogeneous neuroimaging data

Cited by: 145
Authors
Yuan, Lei [2]
Wang, Yalin
Thompson, Paul M. [3]
Narayan, Vaibhav A. [4]
Ye, Jieping [1, 2]
Affiliations
[1] Arizona State Univ, Biodesign Inst, Dept Comp Sci & Engn, Ctr Evolutionary Med & Informat, Tempe, AZ 85287 USA
[2] Arizona State Univ, Sch Comp Informat & Decis Syst Engn, Tempe, AZ 85287 USA
[3] Univ Calif Los Angeles, Dept Neurol, Lab Neuro Imaging, Los Angeles, CA 90024 USA
[4] Johnson & Johnson Pharmaceut Res & Dev LLC, Titusville, NJ USA
Funding
National Institutes of Health (USA); National Science Foundation (USA); Canadian Institutes of Health Research;
Keywords
Multi-source feature learning; Multi-task learning; Incomplete data; Ensemble; ALZHEIMERS-DISEASE; COMPONENT ANALYSIS; CSF BIOMARKERS; MISSING DATA; FUSING FMRI; MRI; REGRESSION; ATROPHY; CLASSIFICATION; HIPPOCAMPAL;
DOI
10.1016/j.neuroimage.2012.03.059
Chinese Library Classification number
Q189 [Neuroscience];
Discipline classification code
071006;
Abstract
Analysis of incomplete data is a big challenge when integrating large-scale brain imaging datasets from different imaging modalities. In the Alzheimer's Disease Neuroimaging Initiative (ADNI), for example, over half of the subjects lack cerebrospinal fluid (CSF) measurements; an independent half of the subjects do not have fluorodeoxyglucose positron emission tomography (FDG-PET) scans; many lack proteomics measurements. Traditionally, subjects with missing measures are discarded, resulting in a severe loss of available information. In this paper, we address this problem by proposing an incomplete Multi-Source Feature (iMSF) learning method where all the samples (with at least one available data source) can be used. To illustrate the proposed approach, we classify patients from the ADNI study into groups with Alzheimer's disease (AD), mild cognitive impairment (MCI) and normal controls, based on the multi-modality data. At baseline, ADNI's 780 participants (172 AD, 397 MCI, 211 NC), have at least one of four data types: magnetic resonance imaging (MRI), FDG-PET, CSF and proteomics. These data are used to test our algorithm. Depending on the problem being solved, we divide our samples according to the availability of data sources, and we learn shared sets of features with state-of-the-art sparse learning methods. To build a practical and robust system, we construct a classifier ensemble by combining our method with four other methods for missing value estimation. Comprehensive experiments with various parameters show that our proposed iMSF method and the ensemble model yield stable and promising results. (c) 2012 Elsevier Inc. All rights reserved.
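The abstract's step of dividing samples according to which data sources are available can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `group_by_availability`, the toy mask, and its column order (MRI, FDG-PET, CSF, proteomics) are assumptions made for illustration only, and the sparse learning of shared features across the resulting groups is not shown.

```python
import numpy as np

def group_by_availability(available):
    """Map each availability pattern to the indices of the samples sharing it.

    `available` is an (n_samples, n_sources) array-like of 0/1 or booleans,
    where a true entry means that source was measured for that sample.
    Samples with no available source are skipped.
    """
    available = np.asarray(available, dtype=bool)
    groups = {}
    for idx, row in enumerate(available):
        if not row.any():
            continue  # no data source at all: cannot be used
        pattern = tuple(bool(v) for v in row)
        groups.setdefault(pattern, []).append(idx)
    return groups

# Hypothetical toy mask; assumed column order: MRI, FDG-PET, CSF, proteomics.
mask = [
    [1, 1, 1, 1],  # all four sources
    [1, 0, 0, 0],  # MRI only
    [1, 1, 0, 0],  # MRI and FDG-PET
    [1, 0, 0, 0],  # MRI only
    [0, 0, 0, 0],  # nothing available
]
groups = group_by_availability(mask)
for pattern, indices in groups.items():
    print(pattern, indices)
```

Each group contains samples with an identical set of observed sources, so a model can be fit on a group without imputing missing values; how the paper couples these groups to learn shared features is described in the article itself.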
Pages: 622-632
Page count: 11