MoGUL: Detecting Common Insertions and Deletions in a Population

Cited by: 0
Authors
Lee, Seunghak [1, 2]
Xing, Eric [2]
Brudno, Michael [1, 3]
Affiliations
[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 1A1, Canada
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
[3] Univ Toronto, Banting & Best Dept Med Res, Toronto, ON, Canada
Source
RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, PROCEEDINGS | 2010, Vol. 6044
Keywords
STRUCTURAL VARIATION; HUMAN GENOME; DISCOVERY
DOI
Not available
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
While the discovery of structural variants in the human population is ongoing, most methods for this task assume that the genome is sequenced to high coverage (e.g., 40x), and use the combined power of the many sequenced reads and mate pairs to identify the variants. In contrast, the 1000 Genomes Project hopes to sequence hundreds of human genotypes, but at low coverage (4-6x), and most of the current methods are unable to discover insertion/deletion and structural variants from these data. In order to identify indels from multiple low-coverage individuals, we have developed the MoGUL (Mixture of Genotypes Variant Locator) framework, which identifies potential locations with indels by examining mate pairs generated from all sequenced individuals simultaneously, uses a Bayesian network with appropriate priors to explicitly model each individual as homozygous or heterozygous for each locus, and computes the expected Minor Allele Frequency (MAF) for all predicted variants. We have used MoGUL to identify variants in 1000 Genomes data, as well as in simulated genotypes, and show good accuracy at predicting indels, especially for MAF > 0.06 and indel size > 20 base pairs.
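The abstract's last modelling step, computing an expected MAF from per-individual genotype probabilities, can be illustrated with a minimal sketch. This is not MoGUL's Bayesian network: the Hardy-Weinberg prior, the three-genotype encoding, and all function names and input values are assumptions chosen for illustration, and the per-genotype likelihoods are taken as given rather than derived from mate-pair data.

```python
from typing import Sequence, Tuple

# Genotype order used throughout (an assumed encoding, not taken from the paper):
# (homozygous reference, heterozygous, homozygous variant)
Genotype3 = Tuple[float, float, float]


def hwe_prior(allele_freq: float) -> Genotype3:
    """Hardy-Weinberg genotype prior for a variant allele frequency (assumed prior)."""
    q = allele_freq
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)


def genotype_posterior(prior: Genotype3, likelihood: Genotype3) -> Genotype3:
    """Bayes' rule over the three genotypes: posterior is prior times likelihood, normalised."""
    unnorm = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("all genotypes have zero probability")
    return (unnorm[0] / total, unnorm[1] / total, unnorm[2] / total)


def expected_maf(posteriors: Sequence[Genotype3]) -> float:
    """Expected minor allele frequency from per-individual genotype posteriors."""
    if not posteriors:
        raise ValueError("need at least one individual")
    # Expected number of variant alleles: 1 per heterozygote, 2 per variant homozygote.
    expected_alt = sum(p_het + 2.0 * p_hom_var for _, p_het, p_hom_var in posteriors)
    freq = expected_alt / (2.0 * len(posteriors))
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq, 1.0 - freq)
```

For example, four individuals whose genotypes are known with certainty to be reference homozygote, heterozygote, variant homozygote, and reference homozygote carry 3 variant alleles out of 8, so `expected_maf` returns 0.375.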
Pages: 357 / +
Page count: 3