MetaSV: an accurate and integrative structural-variant caller for next generation sequencing

Cited by: 107
Authors
Mohiyuddin, Marghoob [1]
Mu, John C. [1]
Li, Jian [1]
Asadi, Narges Bani [1]
Gerstein, Mark B. [2]
Abyzov, Alexej [3]
Wong, Wing H. [4, 5]
Lam, Hugo Y. K. [1]
Affiliations
[1] Roche Sequencing, Bina Technol, Redwood City, CA 94065 USA
[2] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[3] Mayo Clin, Ctr Individualized Med, Dept Hlth Sci Res, Rochester, MN 55905 USA
[4] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[5] Stanford Univ, Dept Hlth Res & Policy, Stanford, CA 94305 USA
Keywords
NUCLEOTIDE-RESOLUTION; PAIRED-END; GENOME; BREAKPOINTS; ALGORITHM;
DOI
10.1093/bioinformatics/btv204
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Summary: Structural variations (SVs) are large genomic rearrangements that vary significantly in size, making them challenging to detect with the relatively short reads from next-generation sequencing (NGS). Different SV detection methods have been developed; however, each is limited to specific kinds of SVs and varies in accuracy and resolution. Previous works have attempted to combine different methods, but they still suffer from poor accuracy, particularly for insertions. We propose MetaSV, an integrated SV caller that leverages multiple orthogonal SV signals for high accuracy and resolution. MetaSV proceeds by merging SVs from multiple tools for all types of SVs. It also analyzes soft-clipped reads from alignment to detect insertions accurately, since existing tools underestimate insertion SVs. Local assembly in combination with dynamic programming is used to improve breakpoint resolution. Paired-end and coverage information is used to predict SV genotypes. Using simulation and experimental data, we demonstrate the effectiveness of MetaSV across various SV types and sizes.
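To make the merging step in the abstract concrete, the following is a minimal sketch of how calls from several tools might be clustered by reciprocal overlap. It is an illustration under stated assumptions, not MetaSV's actual procedure: the abstract does not give the merging criterion, and the names `SV`, `reciprocal_overlap`, `merge_calls`, the tool labels, and the 0.5 threshold are all hypothetical.

```python
from collections import namedtuple

# Hypothetical record for one SV call; the field names are assumptions for this sketch.
SV = namedtuple("SV", "chrom start end svtype tool")


def reciprocal_overlap(a, b):
    """Return the overlap length divided by the longer of the two intervals."""
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        # Disjoint or zero-length intervals (e.g. insertions) never merge here.
        return 0.0
    return overlap / max(a.end - a.start, b.end - b.start)


def merge_calls(calls, min_ro=0.5):
    """Greedily cluster same-type calls whose reciprocal overlap with the
    cluster's first call is at least min_ro (an assumed threshold)."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start, c.end)):
        last = clusters[-1] if clusters else None
        if (last is not None
                and last["rep"].chrom == call.chrom
                and last["rep"].svtype == call.svtype
                and reciprocal_overlap(last["rep"], call) >= min_ro):
            last["tools"].add(call.tool)  # another tool supports this event
        else:
            clusters.append({"rep": call, "tools": {call.tool}})
    return clusters


if __name__ == "__main__":
    calls = [
        SV("chr1", 1000, 2000, "DEL", "toolA"),
        SV("chr1", 1050, 2020, "DEL", "toolB"),
        SV("chr1", 5000, 5600, "DUP", "toolA"),
    ]
    for cluster in merge_calls(calls):
        print(cluster["rep"], sorted(cluster["tools"]))
```

Each cluster keeps its first call as the representative interval and records which tools support it. Insertions, which are usually reported as a single position, would need a distance-based criterion instead; this sketch does not handle them.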
Pages: 2741-2744
Page count: 4
Related papers
16 in total
[1]   CNVnator: An approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing [J].
Abyzov, Alexej ;
Urban, Alexander E. ;
Snyder, Michael ;
Gerstein, Mark .
GENOME RESEARCH, 2011, 21 (06) :974-984
[2]   Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms [J].
Abyzov, Alexej ;
Li, Shantao ;
Kim, Daniel Rhee ;
Mohiyuddin, Marghoob ;
Stuetz, Adrian M. ;
Parrish, Nicholas F. ;
Mu, Xinmeng Jasmine ;
Clark, Wyatt ;
Chen, Ken ;
Hurles, Matthew ;
Korbel, Jan O. ;
Lam, Hugo Y. K. ;
Lee, Charles ;
Gerstein, Mark B. .
NATURE COMMUNICATIONS, 2015, 6
[3]   AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision [J].
Abyzov, Alexej ;
Gerstein, Mark .
BIOINFORMATICS, 2011, 27 (05) :595-603
[4]   SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing [J].
Bankevich, Anton ;
Nurk, Sergey ;
Antipov, Dmitry ;
Gurevich, Alexey A. ;
Dvorkin, Mikhail ;
Kulikov, Alexander S. ;
Lesin, Valery M. ;
Nikolenko, Sergey I. ;
Son Pham ;
Prjibelski, Andrey D. ;
Pyshkin, Alexey V. ;
Sirotkin, Alexander V. ;
Vyahhi, Nikolay ;
Tesler, Glenn ;
Alekseyev, Max A. ;
Pevzner, Pavel A. .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2012, 19 (05) :455-477
[5]   Chen K, 2009, NAT METHODS, V6, P677, DOI 10.1038/nmeth.1363
[6]   Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays [J].
Drmanac, Radoje ;
Sparks, Andrew B. ;
Callow, Matthew J. ;
Halpern, Aaron L. ;
Burns, Norman L. ;
Kermani, Bahram G. ;
Carnevali, Paolo ;
Nazarenko, Igor ;
Nilsen, Geoffrey B. ;
Yeung, George ;
Dahl, Fredrik ;
Fernandez, Andres ;
Staker, Bryan ;
Pant, Krishna P. ;
Baccash, Jonathan ;
Borcherding, Adam P. ;
Brownley, Anushka ;
Cedeno, Ryan ;
Chen, Linsu ;
Chernikoff, Dan ;
Cheung, Alex ;
Chirita, Razvan ;
Curson, Benjamin ;
Ebert, Jessica C. ;
Hacker, Coleen R. ;
Hartlage, Robert ;
Hauser, Brian ;
Huang, Steve ;
Jiang, Yuan ;
Karpinchyk, Vitali ;
Koenig, Mark ;
Kong, Calvin ;
Landers, Tom ;
Le, Catherine ;
Liu, Jia ;
McBride, Celeste E. ;
Morenzoni, Matt ;
Morey, Robert E. ;
Mutch, Karl ;
Perazich, Helena ;
Perry, Kimberly ;
Peters, Brock A. ;
Peterson, Joe ;
Pethiyagoda, Charit L. ;
Pothuraju, Kaliprasad ;
Richter, Claudia ;
Rosenbaum, Abraham M. ;
Roy, Shaunak ;
Shafto, Jay ;
Sharanhovich, Uladzislau .
SCIENCE, 2010, 327 (5961) :78-81
[7]   Detecting and annotating genetic variations using the HugeSeq pipeline [J].
Lam, Hugo Y. K. ;
Pan, Cuiping ;
Clark, Michael J. ;
Lacroute, Phil ;
Chen, Rui ;
Haraksingh, Rajini ;
O'Huallachain, Maeve ;
Gerstein, Mark B. ;
Kidd, Jeffrey M. ;
Bustamante, Carlos D. ;
Snyder, Michael .
NATURE BIOTECHNOLOGY, 2012, 30 (03) :226-229
[8]   Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library [J].
Lam, Hugo Y. K. ;
Mu, Xinmeng Jasmine ;
Stuetz, Adrian M. ;
Tanzer, Andrea ;
Cayting, Philip D. ;
Snyder, Michael ;
Kim, Philip M. ;
Korbel, Jan O. ;
Gerstein, Mark B. .
NATURE BIOTECHNOLOGY, 2010, 28 (01) :47-U76
[9]   LUMPY: a probabilistic framework for structural variant discovery [J].
Layer, Ryan M. ;
Chiang, Colby ;
Quinlan, Aaron R. ;
Hall, Ira M. .
GENOME BIOLOGY, 2014, 15 (06)
[10]   Mapping copy number variation by population-scale genome sequencing [J].
Mills, Ryan E. ;
Walter, Klaudia ;
Stewart, Chip ;
Handsaker, Robert E. ;
Chen, Ken ;
Alkan, Can ;
Abyzov, Alexej ;
Yoon, Seungtai Chris ;
Ye, Kai ;
Cheetham, R. Keira ;
Chinwalla, Asif ;
Conrad, Donald F. ;
Fu, Yutao ;
Grubert, Fabian ;
Hajirasouliha, Iman ;
Hormozdiari, Fereydoun ;
Iakoucheva, Lilia M. ;
Iqbal, Zamin ;
Kang, Shuli ;
Kidd, Jeffrey M. ;
Konkel, Miriam K. ;
Korn, Joshua ;
Khurana, Ekta ;
Kural, Deniz ;
Lam, Hugo Y. K. ;
Leng, Jing ;
Li, Ruiqiang ;
Li, Yingrui ;
Lin, Chang-Yun ;
Luo, Ruibang ;
Mu, Xinmeng Jasmine ;
Nemesh, James ;
Peckham, Heather E. ;
Rausch, Tobias ;
Scally, Aylwyn ;
Shi, Xinghua ;
Stromberg, Michael P. ;
Stuetz, Adrian M. ;
Urban, Alexander Eckehart ;
Walker, Jerilyn A. ;
Wu, Jiantao ;
Zhang, Yujun ;
Zhang, Zhengdong D. ;
Batzer, Mark A. ;
Ding, Li ;
Marth, Gabor T. ;
McVean, Gil ;
Sebat, Jonathan ;
Snyder, Michael ;
Wang, Jun .
NATURE, 2011, 470 (7332) :59-65