Topology-based protein classification: A deep learning approach

Times cited: 0
Authors
Hashemi, Aliye Sadat [1]
Vaisman, Iosif I. [1]
Affiliations
[1] George Mason Univ, Sch Syst Biol, Manassas, VA 20110 USA
Keywords
Protein classification; Delaunay tessellation; Deep learning; Topology; Protein superfamily; Machine learning; TOOL; TESSELLATION
DOI
10.1016/j.bbrc.2024.151240
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Utilizing Artificial Intelligence (AI) in computational biology techniques could offer significant advantages in alleviating the growing workloads faced by structural biologists, especially with the emergence of big data. In this study, we employed Delaunay tessellation as a promising method to obtain the overall structural topology of proteins. Subsequently, we developed multi-class deep neural network models to classify protein superfamilies based on their local topology. Our models achieved a test accuracy of approximately 0.92 in classifying proteins into 18 well-populated superfamilies. We believe that the results of this study hold substantial value since, to the best of our knowledge, no previous studies have reported the utilization of protein topological data for protein classification through deep learning and Delaunay tessellation.
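The abstract does not say how local topology is encoded for the neural network, so the following is only an illustrative sketch and not the authors' pipeline. It assumes the tetrahedra of a Delaunay tessellation are already available as 4-tuples of residue indices (for example from the `simplices` attribute of `scipy.spatial.Delaunay`). It then summarizes them by sequence contiguity, a feature definition chosen here purely for illustration. The names `contiguity_class`, `topology_features`, and `toy_simplices` are hypothetical.

```python
from collections import Counter

# The five possible contiguity patterns of a tetrahedron's four residues,
# written as run lengths of sequence-consecutive indices (an assumed encoding).
CLASSES = [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]


def contiguity_class(simplex):
    """Return the run lengths of consecutive residue indices, longest first."""
    idx = sorted(simplex)
    runs = [1]
    for prev, cur in zip(idx, idx[1:]):
        if cur - prev == 1:
            runs[-1] += 1      # extend the current run of sequence neighbors
        else:
            runs.append(1)     # a gap in the sequence starts a new run
    return tuple(sorted(runs, reverse=True))


def topology_features(simplices):
    """Fraction of tetrahedra in each contiguity class (fixed-length vector)."""
    counts = Counter(contiguity_class(s) for s in simplices)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CLASSES]


# Toy input: made-up residue-index quadruplets, not data from the paper.
toy_simplices = [
    (0, 1, 2, 3),
    (1, 2, 3, 7),
    (2, 3, 8, 9),
    (0, 4, 5, 9),
    (0, 3, 6, 9),
]
features = topology_features(toy_simplices)
```

A fixed-length vector like `features` is the kind of input a multi-class classifier could consume. The network architecture and the actual descriptors used in the study are not given in this record.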
Pages: 9