Meta-iPVP: a sequence-based meta-predictor for improving the prediction of phage virion proteins using effective feature representation

Cited by: 69
Authors
Charoenkwan, Phasit [1]
Nantasenamat, Chanin [2]
Hasan, Md. Mehedi [3]
Shoombuatong, Watshara [2]
Affiliations
[1] Chiang Mai Univ, Coll Arts Media & Technol, Modern Management & Informat Technol, Chiang Mai 50200, Thailand
[2] Mahidol Univ, Fac Med Technol, Ctr Data Min & Biomed Informat, Bangkok 10700, Thailand
[3] Kyushu Inst Technol, Dept Biosci & Bioinformat, 680-4 Kawazu, Iizuka, Fukuoka 8208502, Japan
Keywords
Phage virion protein; Machine learning; Classification; Feature selection; Support vector machine; Meta-predictor; AROMATASE INHIBITORY-ACTIVITY; WEB SERVER; BACTERIOPHAGE VIRION; FEATURE-SELECTION; IDENTIFICATION; PEPTIDES;
DOI
10.1007/s10822-020-00323-z
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
Phage virion proteins (PVPs) perforate the host cell membrane, eventually culminating in cell rupture and the release of replicated phages. The accurate identification of PVPs is thus a crucial step towards improving our understanding of their biological function and mechanisms. It is therefore desirable to develop a computational method capable of fast and accurate identification of PVPs. To address this, we propose a novel sequence-based meta-predictor employing probabilistic information (referred to herein as Meta-iPVP) for the accurate identification of PVPs. In particular, an efficient feature representation approach was used to generate discriminative probabilistic features from four machine learning (ML) algorithms making use of seven feature encodings. To the best of our knowledge, Meta-iPVP is the first meta-based approach developed for PVP prediction. Independent test results indicated that Meta-iPVP could discern important characteristics between PVPs and non-PVPs and achieved the best accuracy and Matthews correlation coefficient (MCC) of 0.817 and 0.642, respectively, corresponding to improvements of 6-10% and 14-21% over existing PVP predictors. This demonstrates that the proposed Meta-iPVP is a more efficient, robust and promising tool for the identification of PVPs. The predictive model is deployed as a publicly accessible Meta-iPVP webserver freely available online at.
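The two metrics reported in the abstract, accuracy and MCC, are both functions of the four confusion-matrix counts of a binary classifier. The following is a minimal Python sketch of those standard formulas. The function names and the example counts are hypothetical, chosen only for illustration; they are not taken from the paper and do not reproduce its reported values.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts (NOT from the paper): PVP is the positive class.
    tp, tn, fp, fn = 40, 45, 5, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four counts in both numerator and denominator, so it stays informative when the PVP and non-PVP classes are imbalanced.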
Pages: 1105-1116
Number of pages: 12