EXCAVATOR: detecting copy number variants from whole-exome sequencing data

Cited by: 180
Authors
Magi, Alberto [1]
Tattini, Lorenzo [1,2]
Cifola, Ingrid [3]
D'Aurizio, Romina [4,5]
Benelli, Matteo [6]
Mangano, Eleonora [3]
Battaglia, Cristina [3,7]
Bonora, Elena [8]
Kurg, Ants [9]
Seri, Marco [8]
Magini, Pamela [8]
Giusti, Betti [1]
Romeo, Giovanni [8]
Pippucci, Tommaso [8]
De Bellis, Gianluca [3]
Abbate, Rosanna [1]
Gensini, Gian Franco [1]
Affiliations
[1] Univ Florence, Dept Clin & Expt Med, Florence, Italy
[2] G Gaslini Inst Children, Mol Genet Lab, Genoa, Italy
[3] CNR, Inst Biomed Technol, Milan, Italy
[4] CNR, Inst Informat & Telemat, LISM, Pisa, Italy
[5] CNR, Inst Clin Physiol, Pisa, Italy
[6] Careggi Hosp, Diagnost Genet Unit, Florence, Italy
[7] Univ Milan, Dipartimento Biotecnol Med & Med Traslaz BIOMETRA, Milan, Italy
[8] Univ Bologna, Dept Med & Surg Sci, Med Genet Unit, Bologna, Italy
[9] Univ Tartu, Inst Mol & Cell Biol, EE-50090 Tartu, Estonia
Keywords
STRUCTURAL VARIATION; SHORT-READ; GENOME; DISCOVERY; HETEROZYGOSITY; POLYMORPHISM; ABERRATIONS; ALGORITHMS; FRAMEWORK; DELETIONS
DOI
10.1186/gb-2013-14-10-r120
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
We developed a novel software tool, EXCAVATOR, for the detection of copy number variants (CNVs) from whole-exome sequencing data. EXCAVATOR combines a three-step normalization procedure with a novel heterogeneous hidden Markov model algorithm and a calling method that classifies genomic regions into five copy number states. We validate EXCAVATOR on three datasets and compare the results with three other methods. These analyses show that EXCAVATOR outperforms the other methods and is therefore a valuable tool for the investigation of CNVs in large-scale projects, as well as in clinical research and diagnostics. EXCAVATOR is freely available at http://sourceforge.net/projects/excavatortool/.
Pages: 18