A new approach for efficient genotype imputation using information from relatives

Cited by: 696
Authors
Sargolzaei, Mehdi [1, 2]
Chesnais, Jacques P. [2]
Schenkel, Flavio S. [1]
Affiliations
[1] Univ Guelph, Dept Anim & Poultry Sci, Ctr Genet Improvement Livestock, Guelph, ON N1G 2W1, Canada
[2] Semex Alliance, Guelph, ON, Canada
Keywords
Family; Imputation; Haplotype; Rare variant; Sliding window; GENOME-WIDE ASSOCIATION; LINKAGE DISEQUILIBRIUM; MISSING GENOTYPES; DENSITY GENOTYPES; CATTLE; MODEL
DOI
10.1186/1471-2164-15-478
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Background: Genotype imputation can help reduce genotyping costs, particularly for the implementation of genomic selection. In applications entailing large populations, recovering the genotypes of untyped loci using information from reference individuals that were genotyped with a higher density panel is computationally challenging. Popular imputation methods are based on the hidden Markov model and have computational constraints due to an intensive sampling process. A fast, deterministic approach, which makes use of both family and population information, is presented here. All individuals are related and therefore share haplotypes, which may differ in length and frequency depending on their relationships. The method starts with family imputation if pedigree information is available, and then exploits close relationships by searching for long haplotype matches in the reference group using overlapping sliding windows. The search continues as the window size is shrunk in each chromosome sweep in order to capture more distant relationships.
Results: The proposed method gave imputation accuracy higher than or similar to that of Beagle and Impute2 in cattle data sets when all available information was used. When close relatives of target individuals were present in the reference group, the method resulted in higher accuracy than the other two methods even when the pedigree was not used. Rare variants were also imputed with higher accuracy. Finally, computing requirements were considerably lower than those of Beagle and Impute2. The presented method took 28 minutes to impute from 6k to 50k genotypes for 2,000 individuals with a reference size of 64,429 individuals.
Conclusions: The proposed method efficiently makes use of information from close and distant relatives for accurate genotype imputation. In addition to its high imputation accuracy, the method is fast, owing to its deterministic nature, and it can therefore easily be used in large data sets where the use of other methods is impractical.
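The sliding-window search described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes already-phased haplotypes coded 0/1, skips the family-imputation step, and uses hypothetical names throughout (`MISSING`, `window_starts`, `impute_haplotype`, `min_typed`). The 50% window overlap and the rule of filling a locus only when all matching reference haplotypes agree are also assumptions made for illustration.

```python
from typing import List, Sequence

MISSING = -1  # assumed code for an untyped locus (not from the paper)


def window_starts(n: int, size: int, step: int) -> List[int]:
    """Start positions of overlapping windows that cover loci 0..n-1."""
    last = max(0, n - size)
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:
        starts.append(last)  # make sure the chromosome end is covered
    return starts


def impute_haplotype(
    target: Sequence[int],
    reference: Sequence[Sequence[int]],
    window_sizes: Sequence[int] = (8, 4),
    min_typed: int = 2,
) -> List[int]:
    """Fill MISSING alleles in `target` from matching reference haplotypes."""
    result = list(target)
    n = len(result)
    for size in window_sizes:        # one chromosome sweep per window size
        step = max(1, size // 2)     # 50% overlap between windows (assumption)
        for start in window_starts(n, size, step):
            end = min(n, start + size)
            # Match only on observed alleles, never on imputed ones.
            typed = [i for i in range(start, end) if target[i] != MISSING]
            todo = [i for i in range(start, end) if result[i] == MISSING]
            if len(typed) < min_typed or not todo:
                continue
            matches = [
                h for h in reference
                if all(h[i] == target[i] for i in typed)
            ]
            for i in todo:
                alleles = {h[i] for h in matches}
                if len(alleles) == 1:  # fill only when all matches agree
                    result[i] = alleles.pop()
    return result


# Toy data: the target is a recombinant of the first two reference haplotypes.
reference = [
    [0, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1, 1, 0],
]
recombinant = [0, MISSING, 1, 1, 1, MISSING, 1, 1]
print(impute_haplotype(recombinant, reference))  # [0, 0, 1, 1, 1, 0, 1, 1]
```

In this toy case no reference haplotype matches the target across the full 8-locus window, so the first sweep fills nothing. The second sweep, with 4-locus windows, finds a match on each side of the recombination point. This mirrors the idea of shrinking the window to pick up shorter shared segments from more distant relatives.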
Pages: 12