A new computational strategy for identifying essential proteins based on network topological properties and biological information

Cited by: 15
Authors
Qin, Chao [1]
Sun, Yongqi [1]
Dong, Yadong [1]
Affiliations
[1] Beijing Jiaotong University, School of Computer and Information Technology, Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing, People's Republic of China
Source
PLOS ONE | 2017 / Vol. 12 / Issue 07
Funding
National Natural Science Foundation of China;
Keywords
ESSENTIAL GENE IDENTIFICATION; SUBCELLULAR-LOCALIZATION; CENTRALITY; ORTHOLOGY; GENOME; INTERACTOME; INTEGRATION; PREDICTION; DATABASE;
DOI
10.1371/journal.pone.0182031
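Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code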
07 ; 0710 ; 09 ;
Abstract
Essential proteins are the proteins that are indispensable to the survival and development of an organism. Deleting a single essential protein will cause lethality or infertility. Identifying and analysing essential proteins are key to understanding the molecular mechanisms of living cells. There are two types of methods for predicting essential proteins: experimental methods, which require considerable time and resources, and computational methods, which overcome the shortcomings of experimental methods. However, the prediction accuracy of computational methods for essential proteins requires further improvement. In this paper, we propose a new computational strategy named CoTB for identifying essential proteins based on a combination of topological properties, subcellular localization information and orthologous protein information. First, we introduce several topological properties of the protein-protein interaction (PPI) network. Second, we propose new methods for measuring orthologous information and subcellular localization and a new computational strategy that uses a random forest prediction model to obtain a probability score for the proteins being essential. Finally, we conduct experiments on four different Saccharomyces cerevisiae datasets. The experimental results demonstrate that our strategy for identifying essential proteins outperforms traditional computational methods and the most recently developed method, SON. In particular, our strategy improves the prediction accuracy to 89, 78, 79, and 85 percent on the YDIP, YMIPS, YMBD and YHQ datasets at the top 100 level, respectively.
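The "top 100 level" figures in the abstract can be read as the fraction of the 100 highest-scoring proteins that are known to be essential. The sketch below illustrates that evaluation on toy data, under that assumed reading. It is not the CoTB method: plain degree centrality stands in for the random forest probability score, and the function names and protein labels are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count distinct interaction partners per protein in an undirected PPI edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def top_k_precision(scores, essential, k):
    """Fraction of the k highest-scoring proteins that belong to the known-essential set."""
    # Sort by descending score; ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(p in essential for p in top) / len(top)


# Toy data (hypothetical protein names, not taken from the paper's datasets).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
essential = {"A", "C"}

scores = degree_centrality(edges)
precision = top_k_precision(scores, essential, k=2)
print(scores, precision)
```

Any scoring function, including a classifier's predicted probability, could be passed to `top_k_precision` in place of the degree scores.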
Pages: 24