PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data
Cited by: 11
Authors:
El-Manzalawy, Yasser [1]
Munoz, Elyse E. [2]
Lindner, Scott E. [2]
Honavar, Vasant [1]
Affiliations:
[1] Penn State Univ, Coll Informat Sci & Technol, University Pk, PA 16802 USA
[2] Penn State Univ, Dept Biochem & Mol Biol, Ctr Malaria Res, University Pk, PA 16802 USA
Source:
Funding:
National Institutes of Health (USA);
Keywords:
Bioinformatics;
Malaria;
Plasmodium;
Predicting surface-exposed proteins;
Semisupervised learning;
Surface-exposed proteomics;
REVERSE VACCINOLOGY;
IDENTIFICATION;
TECHNOLOGIES;
CANDIDATE;
EFFICACY;
DOI:
10.1002/pmic.201600249
Chinese Library Classification number:
Q5 [Biochemistry];
Discipline classification codes:
071010 ;
081704 ;
Abstract:
Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i.e., proteins mistakenly identified as surface exposed) as well as false negatives (i.e., SEPs not detected due to low expression or other technical limitations). We propose a framework called PlasmoSEP for the reliable identification of SEPs using a novel semisupervised learning algorithm that combines SEPs identified by high-throughput experiments and expert annotation of high-throughput data to augment labeled data for training a predictive model. Our experiments using high-throughput data from the Plasmodium falciparum surface-exposed proteome provide several novel high-confidence predictions of SEPs in P. falciparum and also confirm expert annotations for several others. Furthermore, PlasmoSEP predicts that 25 of 37 experimentally identified SEPs in Plasmodium yoelii salivary gland sporozoites are likely to be SEPs. Finally, PlasmoSEP predicts several novel SEPs in P. yoelii and Plasmodium vivax malaria parasites that can be validated for further vaccine studies. Our computational framework can be easily adapted to improve the interpretation of data from high-throughput studies.
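The abstract names semisupervised self-training as the core idea: a model trained on a small labeled set assigns labels to the unlabeled examples it is most confident about, and those examples are added to the training data for the next round. The sketch below illustrates only that general loop. It is not the PlasmoSEP algorithm: the nearest-centroid classifier, the `margin` confidence threshold, and the two-dimensional toy feature vectors are all assumptions made for illustration and do not come from the paper.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Generic self-training loop with a nearest-centroid base classifier.

    labeled:   list of (features, label) pairs, label 1 = SEP, 0 = non-SEP
    unlabeled: list of feature tuples
    margin:    hypothetical confidence threshold; an unlabeled example is
               pseudo-labeled only if its distances to the two class
               centroids differ by at least this much
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        pos = centroid([x for x, y in labeled if y == 1])
        neg = centroid([x for x, y in labeled if y == 0])
        confident, rest = [], []
        for x in pool:
            d_pos, d_neg = dist(x, pos), dist(x, neg)
            if abs(d_pos - d_neg) >= margin:
                confident.append((x, 1 if d_pos < d_neg else 0))
            else:
                rest.append(x)
        if not confident:
            break  # nothing new to add, so the loop has converged
        labeled.extend(confident)
        pool = rest

    # Final model: centroids of the augmented labeled set.
    pos = centroid([x for x, y in labeled if y == 1])
    neg = centroid([x for x, y in labeled if y == 0])

    def predict(x):
        return 1 if dist(x, pos) < dist(x, neg) else 0

    return predict, labeled, pool


# Toy data (made up): one labeled example per class, three unlabeled.
seed = [((0.0, 0.0), 0), ((4.0, 4.0), 1)]
unknown = [(0.5, 0.2), (3.8, 4.1), (2.0, 2.1)]
predict, augmented, undecided = self_train(seed, unknown)
```

In this toy run the two unlabeled points that lie close to a class centroid are pseudo-labeled and join the training set, while the ambiguous midpoint stays in `undecided`. Leaving low-confidence examples unlabeled is the usual guard against reinforcing early mistakes in self-training.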
Pages: 2967-2976
Page count: 10