The Historical Evolution and Significance of Multiple Sequence Alignment in Molecular Structure and Function Prediction

Cited by: 1
Authors
Zhang, Chenyue [1]
Wang, Qinxin [2]
Li, Yiyang [1]
Teng, Anqi [3]
Hu, Gang [1]
Wuyun, Qiqige [4]
Zheng, Wei [1,5]
Affiliations
[1] Nankai Univ, Sch Stat & Data Sci, NITFID, LPMC & KLMDASR, Tianjin 300071, Peoples R China
[2] Suzhou New & High Tech Innovat Serv Ctr, Suzhou 215011, Peoples R China
[3] Hong Kong Univ Sci & Technol Guangzhou, Biosci & Biomed Engn Thrust, Syst Hub, Guangzhou 511453, Peoples R China
[4] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
[5] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
pairwise sequence alignment; multiple sequence alignment; protein monomer; protein complex; RNA; protein language model; function prediction; protein structure prediction; deep learning; PROTEIN HOMOLOGY DETECTION; META-THREADING-SERVER; SUBSTITUTION MATRICES; BINDING RESIDUES; NUCLEIC-ACID; WEB SERVER; SEARCH; IDENTIFICATION; RECOGNITION; CONTACTS;
DOI
10.3390/biom14121531
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline Classification Codes
071010; 081704
Abstract
Multiple sequence alignment (MSA) has evolved into a fundamental tool in the biological sciences, playing a pivotal role in predicting molecular structures and functions. With broad applications in protein and nucleic acid modeling, MSAs continue to underpin advancements across a range of disciplines. MSAs are not only foundational for traditional sequence comparison techniques but also increasingly important in the context of artificial intelligence (AI)-driven advancements. Recent breakthroughs in AI, particularly in protein and nucleic acid structure prediction, rely heavily on the accuracy and efficiency of MSAs to enhance remote homology detection and guide spatial restraints. This review traces the historical evolution of MSA, highlighting its significance in molecular structure and function prediction. We cover the methodologies used for protein monomers, protein complexes, and RNA, while also exploring emerging AI-based alternatives, such as protein language models, as complementary or replacement approaches to traditional MSAs in application tasks. By discussing the strengths, limitations, and applications of these methods, this review aims to provide researchers with valuable insights into MSA's evolving role, equipping them to make informed decisions in structural prediction research.
Pages: 37