Identifying diagnosis-specific genotype-phenotype associations via joint multitask sparse canonical correlation analysis and classification

Cited by: 28
Authors
Du, Lei [1]
Liu, Fang [1]
Liu, Kefei [2]
Yao, Xiaohui [2]
Risacher, Shannon L. [3]
Han, Junwei [1]
Guo, Lei [1]
Saykin, Andrew J. [3]
Shen, Li [2]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Dept Intelligent Sci & Technol, Xian 710072, Peoples R China
[2] Univ Penn, Dept Biostat Epidemiol & Informat, Perelman Sch Med, Philadelphia, PA 19104 USA
[3] Indiana Univ Sch Med, Dept Radiol & Imaging Sci, Indianapolis, IN 46202 USA
Funding
China Postdoctoral Science Foundation; National Institutes of Health (USA); National Natural Science Foundation of China;
Keywords
QUANTITATIVE TRAIT LOCI; ALZHEIMERS-DISEASE; IMAGING GENETICS; FEATURE-SELECTION; REGRESSION; OPTIMIZATION; MCI; AD;
DOI
10.1093/bioinformatics/btaa434
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Motivation: Brain imaging genetics studies the complex associations between genotypic data such as single nucleotide polymorphisms (SNPs) and imaging quantitative traits (QTs). Neurodegenerative disorders usually exhibit diversity and heterogeneity, so different diagnostic groups might carry distinct imaging QTs, SNPs and interactions between them. Sparse canonical correlation analysis (SCCA) is widely used to identify bi-multivariate genotype-phenotype associations. However, most existing SCCA methods are unsupervised and therefore cannot identify diagnosis-specific genotype-phenotype associations. Results: In this article, we propose a new joint multitask learning method, named MT-SCCALR, which combines the merits of SCCA and logistic regression. MT-SCCALR learns the genotype-phenotype associations of multiple tasks jointly, with each task focusing on identifying one diagnosis-specific genotype-phenotype pattern. Meanwhile, MT-SCCALR not only selects relevant SNPs and imaging QTs for each diagnostic group alone, but also allows the selection of those shared by multiple diagnostic groups. We derive an efficient optimization algorithm whose convergence to a local optimum is guaranteed. Compared with two state-of-the-art methods, MT-SCCALR yields better or similar canonical correlation coefficients and classification performance. In addition, it produces much more discriminative canonical weight patterns than its competitors. This demonstrates the power and capability of MT-SCCALR in identifying diagnostically heterogeneous genotype-phenotype patterns, which would be helpful for understanding the pathophysiology of brain disorders.
Pages: 371-379
Page count: 9