DPFunc: accurately predicting protein function via deep learning with domain-guided structure information

Cited by: 1
Authors
Wang, Wenkang [1]
Shuai, Yunyan [1]
Zeng, Min [1]
Fan, Wei [2]
Li, Min [1]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[2] Univ Oxford, Nuffield Dept Womens & Reprod Hlth, Oxford OX3 9DU, England
Funding
National Natural Science Foundation of China
Keywords
COMPLETE GENOME; LARGE-SCALE; SEQUENCE; ENZYME; DATABASE; IDENTIFICATION; PROFILES; COFACTOR
DOI
10.1038/s41467-024-54816-8
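Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract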
Computational methods for predicting protein function are of great significance in understanding biological mechanisms and treating complex diseases. However, existing computational approaches of protein function prediction lack interpretability, making it difficult to understand the relations between protein structures and functions. In this study, we propose a deep learning-based solution, named DPFunc, for accurate protein function prediction with domain-guided structure information. DPFunc can detect significant regions in protein structures and accurately predict corresponding functions under the guidance of domain information. It outperforms current state-of-the-art methods and achieves a significant improvement over existing structure-based methods. Detailed analyses demonstrate that the guidance of domain information contributes to DPFunc for protein function prediction, enabling our method to detect key residues or regions in protein structures, which are closely related to their functions. In summary, DPFunc serves as an effective tool for large-scale protein function prediction, which pushes the border of protein understanding in biological systems.
Pages: 13