Identifying Protein Subcellular Location with Embedding Features Learned from Networks

Cited by: 43
Authors
Liu, Hongwei [1]
Hu, Bin [2]
Chen, Lei [1]
Lu, Lin [3]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
[2] Guangdong Acad Agr Sci, Guangdong Publ Lab Anim Breeding & Nutr, State Key Lab Livestock & Poultry Breeding, Inst Anim Sci,Guangdong Prov Key Lab Anim Breedin, Guangzhou 510640, Peoples R China
[3] Columbia Univ, Dept Radiol, Med Ctr, New York, NY USA
Keywords
Protein subcellular location prediction; network embedding algorithm; DeepWalk; Node2vec; Mashup; machine learning algorithm; support vector machine; random forest; AMINO-ACID-COMPOSITION; FUNCTIONAL DOMAIN COMPOSITION; PREDICTION; LOCALIZATION; ALGORITHM
DOI
10.2174/1570164617999201124142950
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Identifying protein subcellular location is an important problem because subcellular location is closely related to protein function. Determining locations through biological experiments is the fundamental approach. However, such experiments are costly and time-consuming. The alternative is to design effective computational methods.
Objective: To date, several computational methods have been proposed in this regard. However, these methods mainly adopted features derived from the proteins themselves. Meanwhile, with the development of network techniques, several embedding algorithms have been proposed that encode the nodes of a network into feature vectors. Such algorithms bridge networks and traditional classification algorithms, and thus provide a new way to construct models for predicting protein subcellular location.
Methods: In this study, we analyzed features produced by three network embedding algorithms (DeepWalk, Node2vec and Mashup) applied to one or multiple protein networks. The obtained features were fed into a machine learning algorithm (support vector machine or random forest) to construct the model. Cross-validation was adopted to evaluate all constructed models.
Results: Under cross-validation, the embedding features yielded by Mashup on multiple networks were quite informative for predicting protein subcellular location. The model based on these features was superior to some classic models.
Conclusion: Embedding features yielded by a proper and powerful network embedding algorithm were effective for building a model to predict protein subcellular location, providing new pipelines for building more efficient models.
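The walk-sampling stage that DeepWalk-style embeddings rely on can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy network, the protein names `P1` to `P4`, the function name `random_walks`, and the parameter values are all hypothetical. The later stages (training a skip-gram model on the walks to obtain feature vectors, then fitting a support vector machine or random forest under cross-validation) are omitted here.

```python
import random


def random_walks(adjacency, walk_length=5, walks_per_node=2, seed=0):
    """Sample uniform random walks from every node of an unweighted network.

    adjacency: dict mapping each node to a list of its neighbors.
    Returns a list of walks, each walk being a list of node names.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # isolated node: the walk stops early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy protein network (undirected, stored as adjacency lists).
toy_network = {
    "P1": ["P2", "P3"],
    "P2": ["P1", "P3"],
    "P3": ["P1", "P2", "P4"],
    "P4": ["P3"],
}

walks = random_walks(toy_network)
```

Each walk plays the role of a "sentence" of protein identifiers; in a DeepWalk-style pipeline, a skip-gram model would then turn these sentences into one feature vector per protein.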
Pages: 646-660
Number of pages: 15