Accurate prediction of functional effect of single amino acid variants with deep learning

Citations: 8
Authors
Derbel, Houssemeddine [1]
Zhao, Zhongming [2]
Liu, Qian [1, 3]
Affiliations
[1] Univ Nevada, Nevada Inst Personalized Med, Las Vegas, NV 89154 USA
[2] Univ Texas Hlth Sci Ctr Houston, Ctr Precis Hlth, McWilliams Sch Biomed Informat, Houston, TX 77030 USA
[3] Univ Nevada, Coll Sci, Sch Life Sci, Las Vegas, NV 89154 USA
Funding
National Institutes of Health (USA)
Keywords
Functional effect; Deep learning; Single amino acid variant; Precise estimation; High-throughput experiments; PROTEIN; LANDSCAPE; SEQUENCE; FITNESS; MUTATIONS; DOMAIN;
DOI
10.1016/j.csbj.2023.11.017
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Assessing the functional effects of amino acid variants is a critical problem in proteomics, with applications in clinical medicine and protein engineering. Although naturally occurring variants offer insights into deleterious variants, high-throughput deep mutational experiments enable comprehensive investigation of amino acid variants for a given protein. However, these mutational experiments are too expensive to dissect millions of variants across thousands of proteins. Computational approaches have therefore been proposed, but they rely heavily on hand-crafted evolutionary conservation features, which limits their accuracy. Recent advances in transformer models offer a promising way to precisely estimate the functional effects of protein variants from high-throughput experimental data. Here, we introduce a novel deep learning model, Rep2Mut-V2, which leverages learned representations from transformer models. Rep2Mut-V2 significantly improves prediction accuracy for 27 types of measurements of the functional effects of protein variants. In an evaluation on 38 protein datasets comprising 118,933 single amino acid variants, Rep2Mut-V2 achieved an average Spearman's correlation coefficient of 0.7, surpassing six state-of-the-art methods, including the recently released ESM, DeepSequence and EVE. Even with limited training data, Rep2Mut-V2 outperforms ESM and DeepSequence, showing its potential to extend high-throughput experimental analysis to more protein variants and thereby reduce experimental cost. In conclusion, Rep2Mut-V2 provides accurate predictions of the functional effects of single amino acid variants in protein-coding sequences. This tool can significantly aid the interpretation of variants in human disease studies.
Pages: 5776-5784
Number of pages: 9