Computer prediction of allergen proteins from sequence-derived protein structural and physicochemical properties

Cited by: 59
Authors
Cui, Juan
Han, Lian Yi
Li, Hu
Ung, Choong Yong
Tang, Zhi Qun
Zheng, Chan Juan
Cao, Zhi Wei
Chen, Yu Zong
Affiliations
[1] Natl Univ Singapore, Dept Pharm & Computat Sci, Bioinformat & Drug Design Grp, Singapore 117543, Singapore
[2] Shanghai Ctr Bioinformat Technol, Shanghai 200235, Peoples R China
Keywords
allergen; immunology; statistical learning method; support vector machine;
DOI
10.1016/j.molimm.2006.02.010
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Computational methods have been developed for predicting allergen proteins from sequence segments that show identity, homology, or motif match to a known allergen. These methods achieve good prediction accuracies, but are less effective for novel proteins with no similarity to any known allergen. Methods: This work tests the feasibility of using a statistical learning method, support vector machines (SVM), to predict allergen proteins, including novel ones with no such similarity. The prediction system is trained and tested using 1005 allergen proteins from the Allergome database and 22,469 non-allergen proteins from 7871 Pfam families. Results: Tests on an independent set of 229 allergen and 6717 non-allergen proteins from 7871 Pfam families show that 93.0% and 99.9% of these, respectively, are correctly predicted, which is comparable to the best results of other methods. Of the 18 novel allergen proteins non-homologous to any other proteins in the Swissprot database, 88.9% are correctly predicted. A further screening of 168,128 proteins in the Swissprot database finds that 2.9% of the proteins are predicted as allergen proteins, which is consistent with the estimated numbers from motif-based methods. Conclusions: Our study suggests that SVM is a potentially useful method for predicting allergen proteins and that it has some capability for predicting novel allergen proteins. Our software can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/APPEL. (c) 2006 Elsevier Ltd. All rights reserved.
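The abstract reports its results only as percentages. As a rough reading aid, the sketch below converts each reported percentage into an approximate protein count by rounding to the nearest whole number. These counts are derived here and are not stated in the record; the function name `approx_count` and the dictionary labels are hypothetical.

```python
def approx_count(total: int, percent: float) -> int:
    """Nearest whole number of proteins implied by a reported percentage."""
    return round(total * percent / 100)


# (set size, reported percentage), both taken from the abstract.
reported = {
    "independent allergens correctly predicted": (229, 93.0),
    "independent non-allergens correctly predicted": (6717, 99.9),
    "novel allergens correctly predicted": (18, 88.9),
    "Swissprot proteins predicted as allergens": (168128, 2.9),
}

# Derived estimates only: rounding means the true counts may differ slightly.
counts = {label: approx_count(total, pct) for label, (total, pct) in reported.items()}

for label, n in counts.items():
    total = reported[label][0]
    print(f"{label}: about {n} of {total}")
```

Under this rounding assumption, 88.9% of the 18 novel allergens corresponds to 16 proteins, which is the only whole-number count that rounds back to the reported figure.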
Pages: 514-520
Number of pages: 7