Assessment of computational methods for predicting the effects of missense mutations in human cancers

Cited by: 134
Authors
Gnad, Florian [1]
Baucom, Albion [1]
Mukhyala, Kiran [1]
Manning, Gerard [1]
Zhang, Zemin [1]
Affiliations
[1] Genentech Inc, Dept Bioinformat & Computat Biol, San Francisco, CA 94080 USA
Keywords
SOMATIC MUTATIONS; PROTEIN FUNCTION; DATABASE; SIFT; PHOSPHORYLATION; SUBSTITUTIONS; DISEASE;
DOI
10.1186/1471-2164-14-S3-S7
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Background: Recent advances in sequencing technologies have greatly increased the identification of mutations in cancer genomes. However, it remains a significant challenge to identify cancer-driving mutations, since most observed missense changes are neutral passenger mutations. Various computational methods have been developed to predict the effects of amino acid substitutions on protein function and to classify mutations as deleterious or benign. These include approaches that rely on evolutionary conservation, structural constraints, or physicochemical attributes of amino acid substitutions. Here we review existing methods and further examine eight tools (SIFT, PolyPhen2, Condel, CHASM, mCluster, logRE, SNAP, and MutationAssessor) with respect to their coverage, accuracy, availability, and dependence on other tools.
Results: Single nucleotide polymorphisms with high minor allele frequencies were used as a negative (neutral) set for testing, and recurrent mutations from the COSMIC database, as well as novel recurrent somatic mutations identified in very recent cancer studies, were used as positive (non-neutral) sets. Conservation-based methods generally had moderately high accuracy in distinguishing neutral from deleterious mutations, whereas the performance of machine-learning-based predictors with comprehensive feature spaces varied between assessments using different positive sets. MutationAssessor consistently provided the highest accuracies. For certain combinations, metapredictors slightly improved the performance of the included individual methods, but did not outperform MutationAssessor as a stand-alone tool.
Conclusions: Our independent assessment of existing tools reveals various performance disparities. Cancer-trained methods did not improve upon more general predictors. No method or combination of methods exceeded 81% accuracy, indicating that there is still significant room for improvement in driver mutation prediction and that more sophisticated feature integration may be needed to develop a more robust tool.
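The benchmark design described in the abstract (a neutral set of common polymorphisms, a non-neutral set of recurrent somatic mutations, and a comparison of tools by coverage and accuracy) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the variant identifiers, the `evaluate` function, the `Metrics` fields, and the toy predictions are all hypothetical, and the abstract does not state exactly how accuracy was computed, so both plain and balanced accuracy are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    sensitivity: float        # fraction of scored positives called deleterious
    specificity: float        # fraction of scored negatives called neutral
    accuracy: float           # correct calls over all scored variants
    balanced_accuracy: float  # mean of sensitivity and specificity
    coverage: float           # fraction of all variants the tool could score


def evaluate(predictions, positives, negatives):
    """Score one tool against a positive and a negative variant set.

    predictions maps a variant id to True (deleterious) or False (neutral).
    Variants missing from the mapping count as unscored and only lower coverage.
    """
    tp = sum(1 for v in positives if predictions.get(v) is True)
    fn = sum(1 for v in positives if predictions.get(v) is False)
    tn = sum(1 for v in negatives if predictions.get(v) is False)
    fp = sum(1 for v in negatives if predictions.get(v) is True)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each set needs at least one scored variant")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    scored = tp + fn + tn + fp
    return Metrics(
        sensitivity=sensitivity,
        specificity=specificity,
        accuracy=(tp + tn) / scored,
        balanced_accuracy=(sensitivity + specificity) / 2,
        coverage=scored / (len(positives) + len(negatives)),
    )


# Toy data (hypothetical ids): P* stand in for recurrent somatic mutations,
# N* for common polymorphisms. N4 is left unscored on purpose.
positives = ["P1", "P2", "P3", "P4"]
negatives = ["N1", "N2", "N3", "N4"]
toy_predictions = {
    "P1": True, "P2": True, "P3": True, "P4": False,
    "N1": False, "N2": False, "N3": True,
}

toy_metrics = evaluate(toy_predictions, positives, negatives)
print(toy_metrics)
```

Reporting coverage alongside accuracy matters in a comparison like this one, because a tool that declines to score difficult variants can otherwise look more accurate than one that scores everything.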
Pages: 13