Accurate estimation of isoelectric point of protein and peptide based on amino acid sequences

Cited by: 56
Authors
Audain, Enrique [1]
Ramos, Yassel [2]
Hermjakob, Henning [3]
Flower, Darren R. [4]
Perez-Riverol, Yasset [3]
Affiliations
[1] Ctr Mol Immunol, Dept Prote, Havana, Cuba
[2] Ctr Genet Engn & Biotechnol, Dept Prote, Havana, Cuba
[3] European Bioinformat Inst EMBL EBI, Dept European Mol Biol Lab, Wellcome Trust Genome Campus, Cambridge CB10 1SD, England
[4] Aston Univ, Sch Life & Hlth Sci, Aston Triangle, Birmingham B4 7ET, W Midlands, England
Funding
Biotechnology and Biological Sciences Research Council (UK);
Keywords
PROTEOMICS; MASS; PI; FRACTIONATION; DESCRIPTORS;
DOI
10.1093/bioinformatics/btv674
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: In any macromolecular polyprotic system (for example protein, DNA or RNA), the isoelectric point, commonly referred to as the pI, can be defined as the point of singularity in a titration curve, corresponding to the solution pH value at which the net overall surface charge, and thus the electrophoretic mobility, of the ampholyte sums to zero. Various modern analytical biochemistry and proteomics methods depend on the isoelectric point as a principal feature for protein and peptide characterization. Protein separation by isoelectric point is a critical part of 2-D gel electrophoresis, a key precursor of proteomics, where discrete spots can be digested in-gel and the proteins subsequently identified by analytical mass spectrometry. Peptide fractionation according to pI is also widely used in current proteomics sample preparation procedures prior to LC-MS/MS analysis. Accurate theoretical prediction of pI would therefore expedite such analysis. While pI calculation is widely used, it remains largely untested, motivating our efforts to benchmark pI prediction methods.
Results: Using data from the database PIP-DB and one publicly available dataset as our reference gold standard, we have undertaken the benchmarking of pI calculation methods. We find that methods vary in their accuracy and are highly sensitive to the choice of basis set. The machine-learning algorithms, especially the SVM-based algorithm, showed superior performance when studying peptide mixtures. In general, learning-based pI prediction methods (such as Cofactor, SVM and Branca) require a large training dataset, and their resulting performance will strongly depend on the quality of that data. In contrast with iterative methods, machine-learning algorithms have the advantage of being able to add new features to improve the accuracy of prediction.
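The definition in the abstract (the pH at which the net charge sums to zero) can be illustrated with a minimal sketch of an iterative pI calculation. This is not the authors' implementation or any of the benchmarked tools: the pKa values below are illustrative assumptions rather than one of the basis sets evaluated in the paper, and the function names `net_charge` and `isoelectric_point` are hypothetical. The sketch sums Henderson-Hasselbalch charge terms for the termini and ionizable side chains, then bisects on pH until the net charge crosses zero.

```python
# Illustrative pKa values (an assumption; NOT a basis set from the paper).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}            # basic side chains
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}   # acidic side chains
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch terms)."""
    # The N-terminus contributes a positive charge while protonated.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    # The C-terminus contributes a negative charge once deprotonated.
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in sequence.upper():
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tol: float = 1e-4) -> float:
    """Bisect on pH in [0, 14] for the point where the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid    # still positively charged: pI lies at higher pH
        else:
            high = mid   # zero or negative: pI lies at lower pH
    return (low + high) / 2.0


if __name__ == "__main__":
    for peptide in ("DDDD", "ACDEFGHIK", "KKKK"):
        print(peptide, round(isoelectric_point(peptide), 2))
```

Because the result depends entirely on the pKa table, swapping in a different table changes the predicted pI, which is consistent with the abstract's remark that such methods are highly sensitive to the choice of basis set.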
Pages: 821-827
Number of pages: 7
Related papers
25 in total
[1]   A Survey of Molecular Descriptors Used in Mass Spectrometry Based Proteomics [J].
Audain, Enrique ;
Sanchez, Aniel ;
Vizcaino, Juan Antonio ;
Perez-Riverol, Yasset .
CURRENT TOPICS IN MEDICINAL CHEMISTRY, 2014, 14 (03) :388-397
[2]   The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences [J].
Bjellqvist, B ;
Hughes, GJ ;
Pasquali, C ;
Paquet, N ;
Ravier, F ;
Sanchez, JC ;
Frutiger, S ;
Hochstrasser, D .
ELECTROPHORESIS, 1993, 14 (10) :1023-1031
[3]   Branca RMM, 2014, NAT METHODS, V11, P59, DOI 10.1038/nmeth.2732
[4]   PIP-DB: the Protein Isoelectric Point database [J].
Bunkute, Egle ;
Cummins, Christopher ;
Crofts, Fraser J. ;
Bunce, Gareth ;
Nabney, Ian T. ;
Flower, Darren R. .
BIOINFORMATICS, 2015, 31 (02) :295-296
[5]   Calculation of the isoelectric point of tryptic peptides in the pH 3.5-4.5 range based on adjacent amino acid effects [J].
Cargile, Benjamin J. ;
Sevinsky, Joel R. ;
Essader, Amal S. ;
Eu, Jerry P. ;
Stephenson, James L., Jr. .
ELECTROPHORESIS, 2008, 29 (13) :2768-2778
[6]   Gel based isoelectric focusing of peptides and the utility of isoelectric point in protein identification [J].
Cargile, BJ ;
Bundy, JL ;
Freeman, TW ;
Stephenson, JL .
JOURNAL OF PROTEOME RESEARCH, 2004, 3 (01) :112-119
[7]   Isoelectric points of multi-domain proteins [J].
Carugo, Oliviero .
BIOINFORMATION, 2007, 2 (03) :101-104
[8]   A versatile peptide pI calculator for phosphorylated and N-terminal acetylated peptides experimentally tested using peptide isoelectric focusing [J].
Gauci, Sharon ;
Van Breukelen, Bas ;
Lemeer, Simone M. ;
Krijgsveld, Jeroen ;
Heck, Albert J. R. .
PROTEOMICS, 2008, 8 (23-24) :4898-4906
[9]   Halligan Brian D., 2009, V527, P283, DOI 10.1007/978-1-60327-834-8_21
[10]   Added value for tandem mass spectrometry shotgun proteomics data validation through isoelectric focusing of peptides [J].
Heller, M ;
Ye, ML ;
Michel, PE ;
Morier, P ;
Stalder, D ;
Jünger, MA ;
Aebersold, R ;
Reymond, FR ;
Rossier, JS .
JOURNAL OF PROTEOME RESEARCH, 2005, 4 (06) :2273-2282