PVPred-SCM: Improved Prediction and Analysis of Phage Virion Proteins Using a Scoring Card Method

Times cited: 50
Authors
Charoenkwan, Phasit [1]
Kanthawong, Sakawrat [2]
Schaduangrat, Nalini [3]
Yana, Janchai [4]
Shoombuatong, Watshara [3]
Affiliations
[1] Chiang Mai Univ, Coll Arts Media & Technol, Modern Management & Informat Technol, Chiang Mai 50200, Thailand
[2] Khon Kaen Univ, Dept Microbiol, Fac Med, Khon Kaen 40002, Thailand
[3] Mahidol Univ, Fac Med Technol, Ctr Data Min & Biomed Informat, Bangkok 10700, Thailand
[4] Chiang Mai Rajabhat Univ, Fac Sci & Technol, Dept Chem, Chiang Mai 50300, Thailand
Keywords
phage virion protein; scoring card method; propensity score; interpretable model; physicochemical properties; machine learning; AROMATASE INHIBITORY-ACTIVITY; WEB SERVER; COAT PROTEIN; SUBCELLULAR-LOCALIZATION; BACTERIOPHAGE VIRION; BIOACTIVITY; CLASSIFIER; PEPTIDES; DNA
DOI
10.3390/cells9020353
Chinese Library Classification number
Q2 [Cell Biology]
Discipline classification codes
071009; 090102
Abstract
Although existing methods have been successful in predicting phage (or bacteriophage) virion proteins (PVPs) using various types of protein features and complex classifiers, such as support vector machine and naive Bayes, these classifiers do not allow interpretability. However, the characterization and analysis of PVPs might be of great significance for understanding the molecular mechanisms of bacteriophage genetics and for the development of antibacterial drugs. Hence, we herein propose a novel method (PVPred-SCM) based on the scoring card method (SCM) in conjunction with dipeptide composition to identify and characterize PVPs. In PVPred-SCM, the propensity scores of 400 dipeptides were calculated using the statistical discrimination approach. A rigorous independent validation test showed that PVPred-SCM, utilizing only dipeptide composition, yielded an accuracy of 77.56%, indicating that PVPred-SCM performed well relative to the state-of-the-art method, which utilizes a number of protein features. Furthermore, the propensity scores of dipeptides were used to provide insights into the biochemical and biophysical properties of PVPs. Upon comparison, PVPred-SCM was found to be superior to the existing methods in terms of simplicity, interpretability, and ease of implementation. Finally, to facilitate high-throughput prediction of PVPs, we provide a user-friendly web server that estimates the likelihood that query sequences are PVPs. It is anticipated that PVPred-SCM will become a useful tool for predicting and analyzing PVPs, or at least a complement to existing methods.
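The abstract gives no implementation details, so the following is only a minimal sketch of how a dipeptide-based scoring card could work, not the authors' code. It assumes that an initial propensity score is the difference between the mean dipeptide composition of PVPs and non-PVPs, rescaled to the range 0 to 1000, and that a sequence is scored by the composition-weighted sum of those scores. All function names and the toy sequences are hypothetical, and any optimization step the published method may apply to the scores is omitted.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered dipeptides.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Return the fraction of each of the 400 dipeptides in a sequence."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        dipeptide = seq[i:i + 2]
        if dipeptide in counts:  # skip non-standard residues
            counts[dipeptide] += 1
            total += 1
    if total == 0:
        return dict.fromkeys(DIPEPTIDES, 0.0)
    return {dp: n / total for dp, n in counts.items()}


def mean_composition(seqs):
    """Average the dipeptide composition over a list of sequences."""
    comps = [dipeptide_composition(s) for s in seqs]
    return {dp: sum(c[dp] for c in comps) / len(comps) for dp in DIPEPTIDES}


def initial_propensity_scores(positives, negatives):
    """Assumed scoring rule: class difference of means, rescaled to 0..1000."""
    pos, neg = mean_composition(positives), mean_composition(negatives)
    diff = {dp: pos[dp] - neg[dp] for dp in DIPEPTIDES}
    lo, hi = min(diff.values()), max(diff.values())
    if hi == lo:  # no discriminating signal at all
        return dict.fromkeys(DIPEPTIDES, 500.0)
    return {dp: 1000.0 * (d - lo) / (hi - lo) for dp, d in diff.items()}


def scm_score(seq, scores):
    """Composition-weighted sum of propensity scores for one sequence."""
    comp = dipeptide_composition(seq)
    return sum(comp[dp] * scores[dp] for dp in DIPEPTIDES)


# Toy data (hypothetical, not real PVP sequences).
toy_pvps = ["AAAA", "AAAC"]
toy_non_pvps = ["WWWW", "WWWY"]
toy_scores = initial_propensity_scores(toy_pvps, toy_non_pvps)
```

In a scheme like this, a query sequence would be called a PVP when `scm_score` exceeds a threshold chosen on training data, and the per-dipeptide scores themselves remain directly inspectable, which is the interpretability advantage the abstract describes.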
Pages: 22