Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads

Cited by: 1492
Authors
Ye, Kai [1, 2, 3, 4]
Schulz, Marcel H. [1, 5, 6]
Long, Quan [7]
Apweiler, Rolf [1]
Ning, Zemin [7]
Affiliations
[1] EMBL Outstat European Bioinformat Inst, Cambridge, England
[2] Leiden Univ, Med Ctr, Dept Mol Epidemiol, Leiden, Netherlands
[3] Leiden Univ, Dept Med Stat, Med Ctr, NL-2300 RA Leiden, Netherlands
[4] Leiden Univ, Dept Bioinformat, Med Ctr, NL-2300 RA Leiden, Netherlands
[5] Max Planck Inst Mol Genet, Berlin, Germany
[6] Int Max Planck Res Sch Computat Biol & Sci Comp, Berlin, Germany
[7] Wellcome Trust Sanger Inst, Cambridge, England
DOI
10.1093/bioinformatics/btp394
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: There is a strong demand in the genomic community to develop effective algorithms to reliably identify genomic variants. Indel detection using next-gen data is difficult and identification of long structural variations is extremely challenging.
Results: We present Pindel, a pattern growth approach, to detect breakpoints of large deletions and medium-sized insertions from paired-end short reads. We use both simulated reads and real data to demonstrate the efficiency of the computer program and accuracy of the results.
Pages: 2865-2871
Page count: 7