Meta-iPVP: a sequence-based meta-predictor for improving the prediction of phage virion proteins using effective feature representation

Cited by: 69
Authors
Charoenkwan, Phasit [1]
Nantasenamat, Chanin [2]
Hasan, Md. Mehedi [3]
Shoombuatong, Watshara [2]
Affiliations
[1] Chiang Mai Univ, Coll Arts Media & Technol, Modern Management & Informat Technol, Chiang Mai 50200, Thailand
[2] Mahidol Univ, Fac Med Technol, Ctr Data Min & Biomed Informat, Bangkok 10700, Thailand
[3] Kyushu Inst Technol, Dept Biosci & Bioinformat, 680-4 Kawazu, Iizuka, Fukuoka 8208502, Japan
Keywords
Phage virion protein; Machine learning; Classification; Feature selection; Support vector machine; Meta-predictor; AROMATASE INHIBITORY-ACTIVITY; WEB SERVER; BACTERIOPHAGE VIRION; FEATURE-SELECTION; IDENTIFICATION; PEPTIDES
DOI
10.1007/s10822-020-00323-z
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
Phage virion proteins (PVPs) perforate the host cell membrane, which eventually culminates in cell rupture and the release of replicated phages. The accurate identification of PVPs is thus a crucial step towards improving our understanding of their biological function and mechanisms. It is therefore desirable to develop a computational method capable of fast and accurate identification of PVPs. To address this, we propose a novel sequence-based meta-predictor employing probabilistic information (referred to herein as Meta-iPVP) for the accurate identification of PVPs. In particular, an efficient feature representation approach was used to generate discriminative probabilistic features from four machine learning (ML) algorithms making use of seven feature encodings. To the best of our knowledge, Meta-iPVP is the first meta-based approach that has been developed for PVP prediction. Independent test results indicated that Meta-iPVP could discern important characteristics that distinguish PVPs from non-PVPs and achieved the best accuracy and Matthews correlation coefficient (MCC) of 0.817 and 0.642, respectively, which correspond to improvements of 6-10% and 14-21% over existing PVP predictors. This demonstrates that the proposed Meta-iPVP is a more efficient, robust and promising tool for the identification of PVPs. The predictive model is deployed as a publicly accessible Meta-iPVP web server freely available online at.
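The abstract reports performance as accuracy and MCC on an independent test set. As a minimal sketch of how these two metrics are conventionally computed from a binary confusion matrix, the Python below uses the standard textbook definitions. The function names and the example counts are hypothetical illustrations and are not taken from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention for
    degenerate cases such as a classifier that predicts a single class).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts (PVP = positive class); NOT results from the paper.
    tp, tn, fp, fn = 40, 45, 5, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the numbers of PVPs and non-PVPs are unbalanced.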
Pages: 1105-1116
Number of pages: 12