Meta-iPVP: a sequence-based meta-predictor for improving the prediction of phage virion proteins using effective feature representation

Cited by: 65
Authors
Charoenkwan, Phasit [1]
Nantasenamat, Chanin [2]
Hasan, Md. Mehedi [3]
Shoombuatong, Watshara [2]
Affiliations
[1] Chiang Mai Univ, Coll Arts Media & Technol, Modern Management & Informat Technol, Chiang Mai 50200, Thailand
[2] Mahidol Univ, Fac Med Technol, Ctr Data Min & Biomed Informat, Bangkok 10700, Thailand
[3] Kyushu Inst Technol, Dept Biosci & Bioinformat, 680-4 Kawazu, Iizuka, Fukuoka 8208502, Japan
Keywords
Phage virion protein; Machine learning; Classification; Feature selection; Support vector machine; Meta-predictor; AROMATASE INHIBITORY-ACTIVITY; WEB SERVER; BACTERIOPHAGE VIRION; FEATURE-SELECTION; IDENTIFICATION; PEPTIDES;
DOI
10.1007/s10822-020-00323-z
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Phage virion proteins (PVPs) perforate the host cell membrane, which eventually culminates in cell rupture and the release of replicated phages. The accurate identification of PVPs is thus a crucial step towards improving our understanding of their biological function and mechanisms. It is therefore desirable to develop a computational method capable of fast and accurate identification of PVPs. To address this, we propose a novel sequence-based meta-predictor employing probabilistic information (referred to herein as Meta-iPVP) for the accurate identification of PVPs. In particular, an efficient feature representation approach was used to generate discriminative probabilistic features from four machine learning (ML) algorithms making use of seven feature encodings. To the best of our knowledge, Meta-iPVP is the first meta-based approach developed for PVP prediction. Independent test results indicated that Meta-iPVP could discern important characteristics between PVPs and non-PVPs and achieved the best accuracy and MCC of 0.817 and 0.642, respectively, corresponding to improvements of 6-10% and 14-21% over existing PVP predictors. This demonstrates that the proposed Meta-iPVP is a more efficient, robust and promising tool for the identification of PVPs. The predictive model is deployed as a publicly accessible Meta-iPVP webserver that is freely available online.
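The probabilistic feature representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two encodings (`aac`, `group_composition`), the two toy base learners (`CentroidModel`, `KnnModel`), the residue groups and the example sequences are all hypothetical stand-ins, chosen so the example runs with the Python standard library only. The abstract reports four ML algorithms and seven encodings (28 probabilistic features); this sketch uses two of each (4 features) and assumes non-empty sequences.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical coarse residue groups, used only to obtain a second encoding.
GROUPS = ("AVLIMFWC", "STNQYGP", "DEKRH")


def aac(seq):
    """Amino acid composition: 20 relative frequencies."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def group_composition(seq):
    """Fraction of residues falling into each coarse group."""
    return [sum(seq.count(a) for a in g) / len(seq) for g in GROUPS]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


class CentroidModel:
    """Toy base learner: score from relative distance to the class centroids."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        d0 = euclidean(x, self.centroids[0])
        d1 = euclidean(x, self.centroids[1])
        # Closer to the positive centroid (small d1) gives a score near 1.
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)


class KnnModel:
    """Toy base learner: fraction of positives among the k nearest neighbours."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.data = list(zip(X, y))
        return self

    def predict_proba(self, x):
        nearest = sorted(self.data, key=lambda row: euclidean(x, row[0]))[: self.k]
        return sum(label for _, label in nearest) / len(nearest)


ENCODINGS = {"AAC": aac, "GROUP": group_composition}
LEARNERS = {"centroid": CentroidModel, "knn": KnnModel}


def train_base_models(seqs, labels):
    """Fit one base model per (encoding, learner) pair."""
    models = {}
    for enc_name, encode in ENCODINGS.items():
        X = [encode(s) for s in seqs]
        for learner_name, make in LEARNERS.items():
            models[(enc_name, learner_name)] = make().fit(X, labels)
    return models


def probabilistic_features(seq, models):
    """One predicted probability per base model: the meta-level feature vector."""
    return [m.predict_proba(ENCODINGS[enc](seq)) for (enc, _), m in models.items()]


# Made-up toy sequences (1 = positive class, 0 = negative class), not real PVPs.
train_seqs = ["AVLIMAVLLI", "LLIVAMFWAV", "DEKRHDEKKR", "KRDEEHKRDD"]
train_labels = [1, 1, 0, 0]

models = train_base_models(train_seqs, train_labels)
features = probabilistic_features("AVLLIMFAVL", models)
print(features)
```

In a meta-predictor of this kind, a final classifier would then be trained on such probability vectors rather than on the raw encodings. The record's keywords mention a support vector machine, but which model plays that role is not stated in the abstract, so the sketch stops at the feature vector.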
Pages: 1105-1116
Number of pages: 12