PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations

Cited by: 10
Authors
Auliah, Firda Nurul [1]
Nilamyani, Andi Nur [1]
Shoombuatong, Watshara [2]
Alam, Md Ashad [3]
Hasan, Md Mehedi [1,4]
Kurata, Hiroyuki [1]
Affiliations
[1] Kyushu Inst Technol, Dept Biosci & Bioinformat, 680-4 Kawazu, Iizuka, Fukuoka 8208502, Japan
[2] Mahidol Univ, Fac Med Technol, Ctr Data Min & Biomed Informat, Bangkok 10700, Thailand
[3] Tulane Univ, Tulane Ctr Biomed Informat & Genom, Div Biomed Informat & Genom, John W Deming Dept Med, Sch Med, New Orleans, LA 70112, USA
[4] Japan Soc Promot Sci, Chiyoda Ku, 5-3-1 Kojimachi, Tokyo 1020083, Japan
Funding
Japan Society for the Promotion of Science
Keywords
pupylation; feature encoding; chi-squared; machine learning; BIOINFORMATICS TOOLS; IDENTIFICATION; DATABASE; DOP;
DOI
10.3390/ijms22042120
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Pupylation is a reversible post-translational modification of proteins that plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, these traditional experimental methods are laborious and time-consuming. Hence, computational algorithms that can predict potential pupylation sites from sequence features are highly needed. In this study, we developed a new prediction model, PUP-Fuse, for pupylation site prediction by integrating multiple sequence representations. We explored five types of feature encoding approaches and three machine learning (ML) algorithms. In the final model, we integrated the successive ML scores using a linear regression model. PUP-Fuse achieved a Matthews correlation coefficient (MCC) of 0.768 in a 10-fold cross-validation test. It also outperformed existing predictors in an independent test. The PUP-Fuse web server, with curated datasets, is freely available.
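For readers unfamiliar with the Matthews correlation coefficient reported in the abstract, the following is a minimal sketch of how the metric is computed from confusion-matrix counts. It is not taken from the PUP-Fuse code; the function names and the example counts are hypothetical and chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count (tp, tn, fp, fn) for binary labels, where 1 marks a positive site."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention for the
    undefined case (an assumption here, not something stated in the abstract).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, not results from the paper.
    print(matthews_corrcoef(tp=90, tn=90, fp=10, fn=10))
```

The MCC ranges from -1 to 1 and uses all four confusion-matrix cells, which is why it is often preferred over plain accuracy when positive and negative sites are imbalanced.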
Pages: 1-12
Page count: 12