Decision tree classifier based on topological characteristics of subgraph for the mining of protein complexes from large scale PPI networks

Times cited: 6
Authors
Sahoo, Tushar Ranjan [1]
Patra, Sabyasachi [1]
Vipsita, Swati [1]
Affiliation
[1] IIIT, Dept Comp Sci, Bioinformat Lab, Bhubaneswar, India
Keywords
Protein complex; PPI network; Clustering; Cluster density; Graph; Decision tree classifier; IDENTIFICATION; EFFICIENT; DATABASE; MODULES
DOI
10.1016/j.compbiolchem.2023.107935
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The growing accessibility of large-scale protein interaction data demands extensive research to understand cell organization and its functioning at the network level. Bioinformatics and data mining researchers have extensively studied network clustering to examine the structural and operational features of protein-protein interaction (PPI) networks. Over the past two decades, clustering PPI networks has proven useful in numerous studies for identifying functional modules, understanding the roles of previously unknown proteins, and other purposes. Protein complexes are among the essential cellular components that carry out biological activities. Inferring protein complexes has been made more accessible by experimental approaches. We offer a novel method that integrates a classification model with local topological data, making it more reliable and efficient. This article describes a decision tree classifier based on topological characteristics of the subgraph for mining protein complexes. The proposed graph-based algorithm is an effective and efficient way to identify protein complexes from large-scale PPI networks. The performance of the proposed algorithm is evaluated on protein-protein interaction networks of yeast and human from the Database of Interacting Proteins (DIP) and the Biological General Repository for Interaction Datasets (BioGRID), using widely accepted benchmark protein complexes from the comprehensive resource of mammalian protein complexes (CORUM) and the comprehensive catalogue of yeast protein complexes (CYC2008). The results indicate that our method can outperform the best-performing supervised, semi-supervised, and unsupervised approaches to detecting protein complexes.
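As a rough illustration of what "topological characteristics of the subgraph" can mean, the sketch below computes a few simple features of an induced subgraph, including cluster density (the number of edges divided by the maximum possible number of edges). The abstract does not list the paper's actual feature set, so the function name `subgraph_features`, the chosen features, and the toy network are assumptions made only for this example.

```python
from itertools import combinations


def subgraph_features(adjacency, nodes):
    """Return simple topological features of the subgraph induced by `nodes`.

    `adjacency` maps each protein to the set of its interaction partners
    in an undirected PPI network. The feature set is illustrative only and
    is not taken from the paper.
    """
    nodes = set(nodes)
    n = len(nodes)
    # Count each undirected edge once by checking unordered node pairs.
    edges = sum(
        1
        for u, v in combinations(sorted(nodes), 2)
        if v in adjacency.get(u, set())
    )
    max_edges = n * (n - 1) / 2
    density = edges / max_edges if max_edges else 0.0
    mean_degree = 2 * edges / n if n else 0.0
    return {
        "size": n,
        "edges": edges,
        "density": density,
        "mean_degree": mean_degree,
    }


# Toy network with hypothetical protein names (not from DIP or BioGRID).
ppi = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
features = subgraph_features(ppi, ["A", "B", "C"])
```

A feature dictionary of this kind, computed for known complexes and for non-complex subgraphs, could serve as input to a decision tree classifier; how the paper actually builds its training set and tree is not stated in the abstract.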
Pages: 15
Related papers
5 in total
  • [1] Identifying Protein Complexes Based on Multiple Topological Structures in PPI Networks
    Chen, Bolin
    Wu, Fang-Xiang
    IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2013, 12 (03) : 165 - 172
  • [2] Mining Protein Complexes from PPI Networks Using the Minimum Vertex Cut
    Ding, Xiaojun [1, 2]
    [1] School of Information Science and Engineering
    [2] Department of Computer Science
    Tsinghua Science and Technology, 2012, 17 (06) : 674 - 681
  • [3] Mining protein complexes from PPI networks using the minimum vertex cut
    Ding, Xiaojun
    Wang, Weiping
    Peng, Xiaoqing
    Wang, Jianxin
    Tsinghua Science and Technology, 2012, 17 (06) : 674 - 681
  • [4] A novel graph clustering method with a greedy heuristic search algorithm for mining protein complexes from dynamic and static PPI networks
    Wang, Rongquan
    Wang, Caixia
    Liu, Guixia
    INFORMATION SCIENCES, 2020, 522 : 275 - 298
  • [5] A New Sequential Forward Feature Selection (SFFS) Algorithm for Mining Best Topological and Biological Features to Predict Protein Complexes from Protein-Protein Interaction Networks (PPINs)
    Younis, Haseeb
    Anwar, Muhammad Waqas
    Khan, Muhammad Usman Ghani
    Sikandar, Aisha
    Bajwa, Usama Ijaz
    INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2021, 13 (03) : 371 - 388