Are sites with multiple single nucleotide variants in cancer genomes a consequence of drivers, hypermutable sites or sequencing errors?

Cited by: 4
Authors
Smith, Thomas C. A. [1]
Carr, Antony M. [2]
Eyre-Walker, Adam C. [1]
Affiliations
[1] Univ Sussex, Sch Life Sci, Brighton, E Sussex, England
[2] Univ Sussex, Genome Damage & Stabil Ctr, Brighton, E Sussex, England
Source
PeerJ | 2016 / Vol. 4
Keywords
Cancer; Somatic; Variation; Mutation; Mutation rate variation; Sequencing error; MUTATION-RATE; CHROMATIN ORGANIZATION; WHOLE-GENOME; PATTERNS; RATES; SUBSTITUTION; HUMANS; GENES; DNA; PSEUDOGENES
DOI
10.7717/peerj.2391
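Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract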
Across independent cancer genomes, some sites have been observed to be recurrently hit by single nucleotide variants (SNVs). Such recurrently hit sites might be (i) drivers of cancer that are positively selected during oncogenesis, (ii) hypermutable sites arising from mutation rate variation, or (iii) artefacts of sequencing and assembly errors. We have investigated the cause of recurrently hit sites in a dataset of >3 million SNVs from 507 complete cancer genome sequences. We find evidence that many sites have been hit significantly more often than one would expect by chance, even taking into account the effect of the adjacent nucleotides on the rate of mutation. We find that the density of these recurrently hit sites is higher in non-coding than in coding DNA and hence conclude that most of them are unlikely to be drivers. We also find that most of them lie in parts of the genome that are not uniquely mappable and hence are likely to be due to mapping errors. In support of the error hypothesis, we find that recurrently hit sites are not randomly distributed across sequences from different laboratories. We fit a model to the data in which the rate of mutation is constant across sites but the rate of error varies. This model suggests that approximately 4% of all SNVs in this dataset are errors, but that the rate of error varies by thousands of fold between sites.
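To illustrate what "more often than one would expect by chance" can mean, here is a minimal sketch of a chance baseline. It is not the authors' method: it assumes SNVs fall uniformly at random across sites (a Poisson approximation) and ignores the adjacent-nucleotide effects that the abstract says the paper accounts for. The function names, the genome size of 3e9 sites, and the round figure of 3 million SNVs are all assumptions chosen for illustration.

```python
import math


def poisson_pmf(lam, k):
    """Probability of exactly k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_sites_with_k_hits(n_sites, n_snvs, k):
    """Expected number of sites hit exactly k times if SNVs land uniformly at random.

    Assumption: every site has the same per-site rate n_snvs / n_sites.
    """
    lam = n_snvs / n_sites
    return n_sites * poisson_pmf(lam, k)


def expected_sites_at_least(n_sites, n_snvs, k_min, n_terms=60):
    """Expected number of sites hit k_min or more times under the same uniform model.

    The upper tail is summed directly (truncated after n_terms terms) to avoid
    the loss of precision that comes from computing 1 - P(fewer than k_min hits).
    """
    lam = n_snvs / n_sites
    tail = sum(poisson_pmf(lam, k) for k in range(k_min, k_min + n_terms))
    return n_sites * tail


if __name__ == "__main__":
    GENOME_SITES = 3_000_000_000  # assumed genome size, not taken from the paper
    TOTAL_SNVS = 3_000_000        # round figure; the abstract states ">3 million"
    print(expected_sites_with_k_hits(GENOME_SITES, TOTAL_SNVS, 2))
    print(expected_sites_at_least(GENOME_SITES, TOTAL_SNVS, 3))
```

Under these assumed inputs the uniform baseline predicts roughly 1,500 sites hit exactly twice and fewer than one site hit three or more times, so an observed excess over such a baseline is what would need explaining by drivers, mutation rate variation, or error.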
Pages: 15