iFish: predicting the pathogenicity of human nonsynonymous variants using gene-specific/family-specific attributes and classifiers

Citations: 20
Authors
Wang, Meng [1]
Wei, Liping [1]
Affiliations
[1] Peking Univ, Sch Life Sci, Ctr Bioinformat, State Key Lab Prot & Plant Gene Res, Beijing, Peoples R China
Source
SCIENTIFIC REPORTS | 2016 / Vol. 6
Funding
National Natural Science Foundation of China
Keywords
UNKNOWN CLINICAL-SIGNIFICANCE; AMINO-ACID SUBSTITUTIONS; SEQUENCE VARIANTS; MISSENSE VARIANTS; PROTEIN FUNCTION; DISEASE; MUTATIONS; CONSTRAINT; FRAMEWORK; DATABASE;
DOI
10.1038/srep31321
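Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject classification code
07; 0710; 09
Abstract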
Accurate prediction of the pathogenicity of genomic variants, especially nonsynonymous single nucleotide variants (nsSNVs), is essential in biomedical research and clinical genetics. Most current prediction methods build a generic classifier for all genes. However, different genes and gene families have different features. We investigated whether gene-specific and family-specific customized classifiers could improve prediction accuracy. Customized gene-specific and family-specific attributes were selected with AIC, BIC, and LASSO, and Support Vector Machine classifiers were generated for 254 genes and 152 gene families, covering a total of 5,985 genes. Our results showed that the customized attributes reflected key features of the genes and gene families, and the customized classifiers achieved higher prediction accuracy than the generic classifier. The customized classifiers and the generic classifier for other genes and families were integrated into a new tool named iFish (integrated Functional inference of SNVs in human, http://ifish.cbi.pku.edu.cn). iFish outperformed other methods on benchmark datasets as well as on prioritization of candidate causal variants from whole exome sequencing. iFish provides a user-friendly web-based interface and supports other functionalities such as integration of genetic evidence. iFish would facilitate high-throughput evaluation and prioritization of nsSNVs in human genetics research.
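As a rough illustration of how customized and generic classifiers might be combined in a single tool, the following Python sketch routes a variant to a gene-specific classifier when one exists, then to a family-specific classifier, and otherwise to the generic one. The lookup order, the function name `pick_classifier`, the toy scoring functions, and the placeholder gene and family names are all assumptions made for illustration; the abstract does not state how iFish resolves this choice.

```python
from typing import Callable, Dict, Sequence, Tuple

# A classifier here is any function mapping a feature vector to a score.
# In practice this could wrap a trained SVM; these are toy stand-ins.
Classifier = Callable[[Sequence[float]], float]


def pick_classifier(
    gene: str,
    gene_to_family: Dict[str, str],
    gene_models: Dict[str, Classifier],
    family_models: Dict[str, Classifier],
    generic_model: Classifier,
) -> Tuple[str, Classifier]:
    """Return (level, classifier) for a gene.

    Assumed priority (not taken from the source): gene-specific,
    then family-specific, then generic.
    """
    if gene in gene_models:
        return "gene", gene_models[gene]
    family = gene_to_family.get(gene)
    if family is not None and family in family_models:
        return "family", family_models[family]
    return "generic", generic_model


# Hypothetical placeholder data, for illustration only.
gene_to_family = {"GENE_A": "FAMILY_X", "GENE_B": "FAMILY_X"}
gene_models: Dict[str, Classifier] = {"GENE_A": lambda x: 0.9}
family_models: Dict[str, Classifier] = {"FAMILY_X": lambda x: 0.6}
generic_model: Classifier = lambda x: 0.5

level_a, _ = pick_classifier(
    "GENE_A", gene_to_family, gene_models, family_models, generic_model
)
level_b, _ = pick_classifier(
    "GENE_B", gene_to_family, gene_models, family_models, generic_model
)
level_c, clf_c = pick_classifier(
    "GENE_C", gene_to_family, gene_models, family_models, generic_model
)
```

Here `GENE_A` has its own model, `GENE_B` falls back to the model of its family, and `GENE_C`, which appears in neither table, is scored by the generic model.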
Pages: 10