Identification and validation of copy number variants using SNP genotyping arrays from a large clinical cohort

Cited by: 14
Authors
Valsesia, Armand [1 ,2 ,4 ]
Stevenson, Brian J. [2 ,4 ]
Waterworth, Dawn [3 ]
Mooser, Vincent [3 ,5 ]
Vollenweider, Peter [5 ]
Waeber, Gerard [5 ]
Jongeneel, C. Victor [2 ,4 ,6 ,7 ]
Beckmann, Jacques S. [1 ,8 ]
Kutalik, Zoltan [1 ,2 ]
Bergmann, Sven [1 ,2 ]
Affiliations
[1] Univ Lausanne, Dept Med Genet, Lausanne, Switzerland
[2] Swiss Inst Bioinformat, Lausanne, Switzerland
[3] GlaxoSmithKline, Med Genet Clin Pharmacol & Discovery Med, Philadelphia, PA USA
[4] Ludwig Inst Canc Res, Lausanne, Switzerland
[5] CHU Vaudois, Dept Med, CH-1011 Lausanne, Switzerland
[6] Univ Illinois, Inst Genom Biol, Chicago, IL 60680 USA
[7] Univ Illinois, Natl Ctr Supercomp Applicat, Chicago, IL 60680 USA
[8] CHU Vaudois, Serv Med Genet, CH-1011 Lausanne, Switzerland
Source
BMC GENOMICS | 2012 / Volume 13
Keywords
GENOME-WIDE ASSOCIATION; CIRCULAR BINARY SEGMENTATION; HIDDEN-MARKOV MODEL; STRUCTURAL VARIATION; SUSCEPTIBILITY LOCI; CGH MICROARRAYS; RESOLUTION; POPULATION; ALGORITHMS; DELETIONS;
DOI
10.1186/1471-2164-13-241
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied traits. Genomic structural variants (GSV) such as Copy Number Variation (CNV) may account for part of the missing heritability, but their comprehensive detection requires either next-generation arrays or sequencing. Sophisticated algorithms that infer CNVs by combining the intensities from SNP-probes for the two alleles can already be used to extract a partial view of such GSV from existing data sets.
Results: Here we present several advances to facilitate the latter approach. First, we introduce a novel CNV detection method based on a Gaussian Mixture Model. Second, we propose a new algorithm, PCA merge, for combining copy-number profiles from many individuals into consensus regions. We applied both our new methods as well as existing ones to data from 5612 individuals from the CoLaus study who were genotyped on Affymetrix 500K arrays. We developed a number of procedures in order to evaluate the performance of the different methods. This includes comparison with previously published CNVs as well as using a replication sample of 239 individuals, genotyped with Illumina 550K arrays. We also established a new evaluation procedure that employs the fact that related individuals are expected to share their CNVs more frequently than randomly selected individuals. The ability to detect both rare and common CNVs provides a valuable resource that will facilitate association studies exploring potential phenotypic associations with CNVs.
Conclusion: Our new methodologies for CNV detection and their evaluation will help in extracting additional information from the large amount of SNP-genotyping data on various cohorts and use this to explore structural variants and their impact on complex traits.
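Illustrative sketch (not the authors' implementation): the abstract names a Gaussian Mixture Model as the basis of the CNV detection method but gives no details. The Python example below shows only the general idea, assuming that each probe has a single log-ratio intensity and that three components (loss, neutral, gain) are enough. The function name `fit_gmm_1d`, the simulated intensities, and the starting means are all hypothetical choices made for illustration.

```python
import math
import random


def fit_gmm_1d(x, means, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means.

    Returns (weights, means, sigmas, responsibilities). The responsibilities
    are those computed in the last E-step.
    """
    k = len(means)
    mu = list(means)
    mean_x = sum(x) / len(x)
    sd0 = math.sqrt(sum((v - mean_x) ** 2 for v in x) / len(x)) + 1e-6
    sigma = [sd0] * k          # start every component at the overall spread
    weight = [1.0 / k] * k     # start with equal mixing weights
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        # The constant -0.5*log(2*pi) is omitted because it cancels out.
        resp = []
        for v in x:
            logp = [
                math.log(weight[j]) - math.log(sigma[j])
                - 0.5 * ((v - mu[j]) / sigma[j]) ** 2
                for j in range(k)
            ]
            top = max(logp)
            p = [math.exp(lp - top) for lp in logp]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weight[j] = nj / len(x)
            mu[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var = sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, x)) / nj
            sigma[j] = math.sqrt(var) + 1e-6
    return weight, mu, sigma, resp


# Simulated log-ratio intensities (assumed values, not real array data):
# 50 probes in a loss, 400 copy-neutral probes, 50 probes in a gain.
random.seed(0)
truth = [0] * 50 + [1] * 400 + [2] * 50
centres = [-0.6, 0.0, 0.4]
x = [random.gauss(centres[t], 0.1) for t in truth]

weight, mu, sigma, resp = fit_gmm_1d(x, means=[-0.5, 0.0, 0.5])

# Assign each probe to its most probable state: 0 = loss, 1 = neutral, 2 = gain.
states = [max(range(3), key=lambda j: r[j]) for r in resp]
accuracy = sum(s == t for s, t in zip(states, truth)) / len(truth)
print("estimated means:", [round(m, 2) for m in mu])
print("fraction of probes assigned to the simulated state:", round(accuracy, 3))
```

A practical caller would also need to handle the two allele intensities, probe-specific noise, and the spatial ordering of probes along the genome, none of which this sketch attempts.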
Pages: 15
Related papers
44 in total
  • [1] Detecting large copy number variants using exome genotyping arrays in a large Swedish schizophrenia sample
    Szatkiewicz, J. P.
    Neale, B. M.
    O'Dushlaine, C.
    Fromer, M.
    Goldstein, J. I.
    Moran, J. L.
    Chambert, K.
    Kahler, A.
    Magnusson, P. K. E.
    Hultman, C. M.
    Sklar, P.
    Purcell, S.
    McCarroll, S. A.
    Sullivan, P. F.
    MOLECULAR PSYCHIATRY, 2013, 18 (11) : 1178 - 1184
  • [2] Detecting large copy number variants using exome genotyping arrays in a large Swedish schizophrenia sample
    J P Szatkiewicz
    B M Neale
    C O'Dushlaine
    M Fromer
    J I Goldstein
    J L Moran
    K Chambert
    A Kähler
    P K E Magnusson
    C M Hultman
    P Sklar
    S Purcell
    S A McCarroll
    P F Sullivan
    Molecular Psychiatry, 2013, 18 : 1178 - 1184
  • [3] Genome-Wide Copy Number Variations Inferred from SNP Genotyping Arrays Using a Large White and Minzhu Intercross Population
    Wang, Ligang
    Liu, Xin
    Zhang, Longchao
    Yan, Hua
    Luo, Weizhen
    Liang, Jing
    Cheng, Duxue
    Chen, Shaokang
    Ma, Xiaojun
    Song, Xin
    Zhao, Kebin
    Wang, Lixian
    PLOS ONE, 2013, 8 (10):
  • [4] Accurate and Effective Detection of Recurrent Copy Number Variants in Large SNP Genotype Datasets
    Montalbano, Simone
    Sanchez, Xabier Calle
    Vaez, Morteza
    Helenius, Dorte
    Werge, Thomas
    Ingason, Andres
    CURRENT PROTOCOLS, 2022, 2 (12):
  • [5] A genome-wide detection of copy number variation using SNP genotyping arrays in Beijing-You chickens
    Zhou, Wei
    Liu, Ranran
    Zhang, Jingjing
    Zheng, Maiqing
    Li, Peng
    Chang, Guobin
    Wen, Jie
    Zhao, Guiping
    GENETICA, 2014, 142 (05) : 441 - 450
  • [6] An Algorithm for Detecting High Frequency Copy Number Polymorphisms Using SNP Arrays
    Halldorsson, Bjarni V.
    Gudbjartsson, Daniel F.
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2011, 18 (08) : 955 - 966
  • [7] Genome-wide identification of copy number variations in Holstein cattle from Baja California, Mexico, using high-density SNP genotyping arrays
    Salomon-Torres, R.
    Gonzalez-Vizcarra, V. M.
    Medina-Basulto, G. E.
    Montano-Gomez, M. F.
    Mahadevan, P.
    Yaurima-Basaldua, V. H.
    Villa-Angulo, C.
    Villa-Angulo, R.
    GENETICS AND MOLECULAR RESEARCH, 2015, 14 (04) : 11848 - 11859
  • [8] A systematic benchmark of copy number variation detection tools for high density SNP genotyping arrays
    van Baardwijk, M. N.
    Heijnen, L. S. E. M.
    Zhao, H.
    Baudis, M.
    Stubbs, A. P.
    GENOMICS, 2024, 116 (06)
  • [9] PlatinumCNV: A Bayesian Gaussian mixture model for genotyping copy number polymorphisms using SNP array signal intensity data
    Kumasaka, Natsuhiko
    Fujisawa, Hironori
    Hosono, Naoya
    Okada, Yukinori
    Takahashi, Atsushi
    Nakamura, Yusuke
    Kubo, Michiaki
    Kamatani, Naoyuki
    GENETIC EPIDEMIOLOGY, 2011, 35 (08) : 831 - 844
  • [10] Fast detection of de novo copy number variants from SNP arrays for case-parent trios
    Scharpf, Robert B.
    Beaty, Terri H.
    Schwender, Holger
    Younkin, Samuel G.
    Scott, Alan F.
    Ruczinski, Ingo
    BMC BIOINFORMATICS, 2012, 13