Kernel canonical correlation analysis for assessing gene-gene interactions and application to ovarian cancer

Cited by: 34
Authors
Larson, Nicholas B. [1]
Jenkins, Gregory D. [1]
Larson, Melissa C. [1]
Vierkant, Robert A. [1]
Sellers, Thomas A. [2]
Phelan, Catherine M. [2]
Schildkraut, Joellen M. [3]
Sutphen, Rebecca [4]
Pharoah, Paul D. P. [5]
Gayther, Simon A. [6]
Wentzensen, Nicolas [7]
Goode, Ellen L. [1]
Fridley, Brooke L. [1, 8]
Affiliations
[1] Mayo Clin, Dept Hlth Sci Res, Rochester, MN USA
[2] Univ S Florida, H Lee Moffitt Canc Ctr, Tampa, FL 33682 USA
[3] Duke Univ, Duke Comprehens Canc Ctr, Durham, NC USA
[4] Univ S Florida, Coll Med, Dept Pediat, Tampa, FL 33612 USA
[5] Univ Cambridge, Dept Oncol, Cambridge, England
[6] Univ So Calif, Dept Preventat Med, Los Angeles, CA USA
[7] NCI, Div Canc Epidemiol & Genet, Bethesda, MD 20892 USA
[8] Univ Kansas, Med Ctr, Dept Biostat, Kansas City, KS 66160 USA
Funding
National Institutes of Health (USA);
Keywords
association studies; canonical correlation; gene-gene interaction; kernel methods; GENOME-WIDE ASSOCIATION; PRINCIPAL-COMPONENTS; GENOTYPE DATA; HUMAN-DISEASE; VARIANTS; RISK; SUSCEPTIBILITY; SEQUENCE; TESTS; MODEL;
DOI
10.1038/ejhg.2013.69
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Although single-locus approaches have been widely applied to identify disease-associated single-nucleotide polymorphisms (SNPs), complex diseases are thought to be the product of multiple interactions between loci. This has led to the recent development of statistical methods for detecting statistical interactions between two loci. Canonical correlation analysis (CCA) has previously been proposed to detect gene-gene coassociation. However, this approach is limited to detecting linear relations and can only be applied when the number of observations exceeds the number of SNPs in a gene. This limitation is particularly important for next-generation sequencing, which could yield a large number of novel variants on a limited number of subjects. To overcome these limitations, we propose an approach to detect gene-gene interactions on the basis of a kernelized version of CCA (KCCA). Our simulation studies showed that KCCA controls the Type-I error, and is more powerful than leading gene-based approaches under a disease model with negligible marginal effects. To demonstrate the utility of our approach, we also applied KCCA to assess interactions between 200 genes in the NF-kappa B pathway in relation to ovarian cancer risk in 3869 cases and 3276 controls. We identified 13 significant gene pairs relevant to ovarian cancer risk (local false discovery rate <0.05). Finally, we discuss the advantages of KCCA in gene-gene interaction analysis and its future role in genetic association studies.
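To make the idea of a kernelized canonical correlation between two genes concrete, the following is a minimal sketch and not the authors' implementation. It assumes genotypes coded 0/1/2, a Gaussian (RBF) kernel, and a simple ridge-style regularization; the abstract specifies none of these, and the function names (`rbf_kernel`, `center_kernel`, `kernel_canonical_correlation`) and the parameters `gamma` and `reg` are hypothetical. Because the kernel matrices are n x n regardless of how many SNPs a gene contains, the computation does not require the number of subjects to exceed the number of SNPs.

```python
import numpy as np


def rbf_kernel(X, gamma=None):
    """Gaussian kernel matrix between the rows (subjects) of X.

    Assumption: RBF kernel with gamma = 1 / (number of SNPs) by default.
    """
    X = np.asarray(X, dtype=float)
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances; clip tiny negatives from rounding.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-gamma * d2)


def center_kernel(K):
    """Center a kernel matrix in feature space: H K H with H = I - 11'/n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kernel_canonical_correlation(X, Y, reg=0.1):
    """Leading regularized kernel canonical correlation between X and Y.

    X, Y: (subjects x SNPs) genotype matrices for two genes.
    reg:  hypothetical ridge parameter, scaled by n before being added
          to the diagonal (an assumption, not taken from the source).
    """
    n = X.shape[0]
    Kx = center_kernel(rbf_kernel(X))
    Ky = center_kernel(rbf_kernel(Y))
    ridge = n * reg * np.eye(n)
    # P = (Kx + ridge)^-1 Kx and Q = (Ky + ridge)^-1 Ky; the largest
    # eigenvalue of P Q is the squared regularized canonical correlation.
    P = np.linalg.solve(Kx + ridge, Kx)
    Q = np.linalg.solve(Ky + ridge, Ky)
    eigvals = np.linalg.eigvals(P @ Q)
    rho2 = np.clip(eigvals.real.max(), 0.0, 1.0)
    return float(np.sqrt(rho2))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Simulated genotypes (hypothetical data): 200 subjects, 5 SNPs per gene.
    gene_a = rng.binomial(2, 0.3, size=(200, 5)).astype(float)
    gene_b = rng.binomial(2, 0.3, size=(200, 5)).astype(float)
    # A second gene whose first two SNPs duplicate those of gene_a.
    gene_c = gene_b.copy()
    gene_c[:, :2] = gene_a[:, :2]
    print("independent genes:", kernel_canonical_correlation(gene_a, gene_b))
    print("dependent genes:  ", kernel_canonical_correlation(gene_a, gene_c))
```

The regularization matters here: without it, kernel CCA on full-rank kernel matrices returns a correlation of essentially 1 for any pair of genes, so `reg` would need to be tuned in practice. How the statistic is turned into a test of interaction with disease status is not described in the abstract and is deliberately left out of this sketch.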
Pages: 126-131
Number of pages: 6