Identification of Human Protein Subcellular Location with Multiple Networks

Cited by: 11
Authors
Wang, Rui [1 ]
Chen, Lei [1 ]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
Keywords
Protein subcellular location; multiple networks; network embedding algorithm; Mnode2vec; node2vec; classification algorithm; AMINO-ACID-COMPOSITION; FUNCTIONAL DOMAIN COMPOSITION; PREDICTION; LOCALIZATION
DOI
10.2174/1570164619666220531113704
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Protein function is closely related to the protein's location within the cell, so determining subcellular location helps uncover function. However, traditional biological experiments for determining subcellular location are costly and inefficient and cannot meet today's needs. In recent years, many computational models have been built to identify the subcellular location of proteins. Most of these models use features derived from protein sequences. Recently, features extracted from protein-protein interaction (PPI) networks have become popular in studying various protein-related problems. Objective: A novel model with features derived from multiple PPI networks was proposed to predict protein subcellular location. Methods: Protein features were obtained by a newly designed network embedding algorithm, Mnode2vec, which is a generalized version of the classic Node2vec algorithm. Two classic classification algorithms, support vector machine and random forest, were employed to build the model. Results: The model performed well and was superior to a model with features extracted by Node2vec. It also outperformed some classic models. Furthermore, Mnode2vec was found to produce powerful features when the path length was small. Conclusion: The proposed model can be a powerful tool to determine protein subcellular location, and Mnode2vec can efficiently extract informative features from multiple networks.
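The abstract does not describe how Mnode2vec samples paths, so the following is only an illustrative sketch of one way a random walk could span several PPI networks; it is not the authors' algorithm. The function name `multi_network_walk`, the toy networks, the protein labels, and the rule of picking a network uniformly at each step are all assumptions. The sketch also omits the biased walk (return parameter p and in-out parameter q) and the skip-gram training step that Node2vec uses to turn walks into feature vectors.

```python
import random


def multi_network_walk(networks, start, length, rng):
    """Sample one walk of up to `length` nodes across several networks.

    Assumption (not from the paper): at each step a network is chosen
    uniformly among those in which the current node has a neighbour,
    and the next node is chosen uniformly among those neighbours.
    """
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        # Keep only the networks where the current node has neighbours.
        usable = [net for net in networks if net.get(current)]
        if not usable:
            break  # the node is isolated in every network
        net = rng.choice(usable)
        # Sort the neighbour set so the walk is reproducible for a given seed.
        walk.append(rng.choice(sorted(net[current])))
    return walk


# Two toy PPI networks over hypothetical proteins, stored as adjacency dicts.
net_a = {"P1": {"P2"}, "P2": {"P1", "P3"}, "P3": {"P2"}}
net_b = {"P1": {"P4"}, "P4": {"P1", "P3"}, "P3": {"P4"}}

rng = random.Random(0)
walks = [
    multi_network_walk([net_a, net_b], node, 5, rng)
    for node in ("P1", "P2", "P3", "P4")
    for _ in range(3)
]
```

In a full pipeline, walks like these would be passed to a word-embedding model to produce one vector per protein, and those vectors would then be the input features for a classifier such as a support vector machine or random forest. The short walk length used here is arbitrary.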
Pages: 344-356
Page count: 13
Related papers
61 in total
  • [1] BASIC LOCAL ALIGNMENT SEARCH TOOL
    ALTSCHUL, SF
    GISH, W
    MILLER, W
    MYERS, EW
    LIPMAN, DJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) : 403 - 410
  • [2] A deep learning architecture for metabolic pathway prediction
    Baranwal, Mayank
    Magner, Abram
    Elvati, Paolo
    Saldinger, Jacob
    Violi, Angela
    Hero, Alfred O.
    [J]. BIOINFORMATICS, 2020, 36 (08) : 2547 - 2553
  • [3] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [4] Nearest neighbour algorithm for predicting protein subcellular location by combining functional domain composition and pseudo-amino acid composition
    Cai, YD
    Chou, KC
    [J]. BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2003, 305 (02) : 407 - 411
  • [5] Relation between amino acid composition and cellular location of proteins
    Cedano, J
    Aloy, P
    PerezPons, JA
    Querol, E
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1997, 266 (03) : 594 - 600
  • [6] LIBSVM: A Library for Support Vector Machines
    Chang, Chih-Chung
    Lin, Chih-Jen
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
  • [7] Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features
    Chen, Lei
    Li, ZhanDong
    Zeng, Tao
    Zhang, Yu-Hang
    Zhang, ShiQi
    Huang, Tao
    Cai, Yu-Dong
    [J]. FRONTIERS IN GENETICS, 2021, 12
  • [8] Identify Key Sequence Features to Improve CRISPR sgRNA Efficacy
    Chen, Lei
    Wang, Shaopeng
    Zhang, Yu-Hang
    Li, Jiarui
    Xing, Zhi-Hao
    Yang, Jialiang
    Huang, Tao
    Cai, Yu-Dong
    [J]. IEEE ACCESS, 2017, 5 : 26582 - 26590
  • [9] iMPT-FDNPL: Identification of Membrane Protein Types with Functional Domains and a Natural Language Processing Approach
    Chen, Wei
    Chen, Lei
    Dai, Qi
    [J]. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2021, 2021
  • [10] pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information
    Cheng, Xiang
    Xiao, Xuan
    Chou, Kuo-Chen
    [J]. BIOINFORMATICS, 2018, 34 (09) : 1448 - 1456