AI-Driven Deep Learning Techniques in Protein Structure Prediction

Cited by: 30
Authors
Chen, Lingtao [1]
Li, Qiaomu [1]
Nasif, Kazi Fahim Ahmad [1]
Xie, Ying [1]
Deng, Bobin [1]
Niu, Shuteng [2]
Pouriyeh, Seyedamin [1]
Dai, Zhiyu [3]
Chen, Jiawei [4]
Xie, Chloe Yixin [1]
Affiliations
[1] Kennesaw State Univ, Coll Comp & Software Engn, Marietta, GA 30060 USA
[2] Bowling Green State Univ, Dept Comp Sci, Bowling Green, OH 43403 USA
[3] Washington Univ, John T Milliken Dept Med, Div Pulm & Crit Care Med, Sch Med St Louis, St Louis, MO 63110 USA
[4] Univ Calif Berkeley, Div Comp Data Sci & Soc, Berkeley, CA 94720 USA
Keywords
protein structure; computational methods; artificial intelligence; machine learning; deep learning; transformer; AlphaFold; protein modeling; bioinformatics; healthcare; META-THREADING-SERVER; HOMOLOGY DETECTION; FOLD-RECOGNITION; SWISS-MODEL; GENERATION; ALIGNMENT; ACCURATE; TARGETS; QUALITY; BIOLOGY
DOI
10.3390/ijms25158426
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Protein structure prediction is important for understanding protein function and behavior. This review surveys the computational models used to predict protein structure, tracing the progression from established protein modeling to state-of-the-art artificial intelligence (AI) frameworks. It begins with a brief introduction to protein structures, protein modeling, and AI. The section on established protein modeling covers homology modeling, ab initio modeling, and threading. The section on deep learning-based models introduces state-of-the-art AI models such as the AlphaFold family (AlphaFold, AlphaFold2, AlphaFold3), RoseTTAFold, and ProteinBERT, and discusses how AI techniques have been integrated into established frameworks such as Swiss-Model, Rosetta, and I-TASSER. Model performance is compared using the rankings from CASP14 (Critical Assessment of Structure Prediction) and CASP15; CASP16 was ongoing at the time of writing, so its results are not included. Continuous Automated Model EvaluatiOn (CAMEO), which complements the biennial CASP experiment, is also covered, along with three evaluation metrics: the template modeling score (TM-score), the global distance test total score (GDT_TS), and the Local Distance Difference Test (lDDT) score. The review then acknowledges the ongoing difficulties in predicting protein structure and emphasizes the need for further research in areas such as dynamic protein behavior, conformational changes, and protein-protein interactions. The application section introduces uses in fields such as drug design, industry, education, and novel protein development. In summary, the paper provides an overview of recent advances in established protein modeling and deep learning-based models for protein structure prediction, highlights the progress achieved by AI, and identifies areas for further investigation.
Pages: 21