Inferring copy number and genotype in tumour exome data

Cited by: 87
Authors
Amarasinghe, Kaushalya C. [1]
Li, Jason [1,2]
Hunter, Sally M. [3]
Ryland, Georgina L. [3]
Cowin, Prue A. [4]
Campbell, Ian G. [3,5,6]
Halgamuge, Saman K. [1]
Affiliations
[1] Univ Melbourne, Optimisat & Pattern Recognit Grp, Dept Mech Engn, Melbourne Sch Engn, Parkville, Vic 3010, Australia
[2] Peter MacCallum Canc Ctr, Bioinformat Core Facil, East Melbourne, Vic 3002, Australia
[3] Peter MacCallum Canc Ctr, Canc Genet Lab, East Melbourne, Vic 3002, Australia
[4] Peter MacCallum Canc Ctr, Canc Genom & Genet Lab, East Melbourne, Vic 3002, Australia
[5] Univ Melbourne, Sir Peter MacCallum Dept Oncol, Parkville, Vic 3010, Australia
[6] Univ Melbourne, Dept Pathol, Parkville, Vic 3010, Australia
Source
BMC GENOMICS | 2014 / Vol. 15
Funding
Australian Research Council
Keywords
HIDDEN MARKOV-MODELS; CANCER; BREAST; HETEROZYGOSITY; IDENTIFICATION; MUTATIONS; DISCOVERY; ACCURATE; CAPTURE;
DOI
10.1186/1471-2164-15-732
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Background: Whole exome sequencing is a cost-effective alternative to whole genome sequencing for predicting aberrations in tumours; however, it is predominantly used for variant detection and is infrequently used to detect somatic copy number variation. Results: We propose a new method to infer copy number and genotypes using whole exome data from paired tumour/normal samples. Our algorithm uses two Hidden Markov Models to predict copy number and genotypes, and computationally resolves polyploidy/aneuploidy, normal cell contamination and signal baseline shift. Our method explicitly detects chromosome arm-level events, which are commonly found in tumour samples. The methods are combined into a package named ADTEx (Aberration Detection in Tumour Exome). We applied our algorithm to a cohort of 17 in-house generated and 18 TCGA paired ovarian cancer/normal exomes and evaluated its performance by comparing against the copy number variations and genotypes predicted using Affymetrix SNP 6.0 data from the same samples. Further, we carried out a comparison study showing that ADTEx outperformed its competitors in terms of precision and F-measure. Conclusions: Our proposed method, ADTEx, uses both depth of coverage ratios and B allele frequencies calculated from whole exome sequencing data to predict copy number variations along with their genotypes. ADTEx is implemented as a user-friendly software package using Python and the R statistical language. Source code and sample data are freely available under the GNU license (GPLv3) at http://adtex.sourceforge.net/.
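The two input signals named in the abstract, depth of coverage (DOC) ratios and B allele frequencies (BAF), can be illustrated with a minimal Python sketch. This is not ADTEx's code: the function names, the normalisation by total sequencing depth, and the use of a log2 scale are assumptions made here for illustration only, and the Hidden Markov Model steps are not shown.

```python
import math


def log2_doc_ratio(tumour_depth, normal_depth, tumour_total, normal_total):
    """Log2 ratio of tumour to normal depth for one exon (hypothetical helper).

    Each depth is first divided by its sample's total depth so that samples
    sequenced to different overall depths are comparable (an assumption of
    this sketch, not a detail taken from the abstract).
    """
    if tumour_depth <= 0 or normal_depth <= 0:
        return None  # ratio is undefined without coverage in both samples
    ratio = (tumour_depth / tumour_total) / (normal_depth / normal_total)
    return math.log2(ratio)


def b_allele_frequency(ref_reads, alt_reads):
    """Fraction of reads at a SNP site that carry the non-reference (B) allele."""
    total = ref_reads + alt_reads
    if total == 0:
        return None  # no reads cover this site
    return alt_reads / total


# Example: an exon with twice the normalised depth in the tumour sample,
# and a heterozygous site whose B allele is under-represented.
doc = log2_doc_ratio(200, 100, 1000, 1000)
baf = b_allele_frequency(30, 10)
```

In this sketch a DOC log2 ratio near 0 suggests no copy number change relative to the matched normal, while a BAF that departs from 0.5 at a site heterozygous in the normal sample suggests allelic imbalance.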
Pages: 12