Support-vector-machine classification of linear functional motifs in proteins

Cited by: 7

Authors:

Plewczynski, D

Tkacz, A

Wyrwicz, L

Godzik, A

Kloczkowski, A

Rychlewski, L

Affiliations:

[1] Univ Warsaw, Interdisciplinary Ctr Math & Computat Modeling, PL-02106 Warsaw, Poland

[2] BioInfoBank Inst, PL-60744 Poznan, Poland

[3] Adam Mickiewicz Univ Poznan, Bioinformat Unit, Dept Phys, PL-61614 Poznan, Poland

[4] Univ Calif San Diego, Bioinformat Core JCSG, La Jolla, CA 92093 USA

[5] Burnham Inst, La Jolla, CA 92037 USA

[6] Iowa State Univ, Baker Ctr Bioinformat & Biol Stat, Ames, IA USA

Source:

JOURNAL OF MOLECULAR MODELING | 2006 / Vol. 12 / Issue 04

Keywords:

kinase substrate prediction; profile-profile sequence similarity; local structural segments; linear functional motifs; Swiss-Prot database; support vector machine (SVM);

DOI:

10.1007/s00894-005-0070-2

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Our algorithm predicts short linear functional motifs in proteins using only sequence information. Statistical models for these motifs are built from a database of short sequence fragments taken from proteins in the current release of the Swiss-Prot database. These segments have been experimentally confirmed to carry a single-residue post-translational modification. The classification sensitivity for the various types of short linear motifs is around 70%. The query protein sequence is dissected into short overlapping fragments, and each fragment is represented as a vector. Each vector is then classified by a machine learning algorithm (Support Vector Machine) as potentially modifiable or not. The resulting list of plausible post-translational modification sites in the query protein is returned to the user. We also present a study of the human protein kinase C family as a biological application of our method.
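
The first two steps of the abstract (dissecting the query sequence into short overlapping fragments and representing each fragment as a vector) can be illustrated with a minimal Python sketch. The window half-width `flank`, the padding character `X`, and the one-hot encoding are assumptions made for illustration only; the abstract does not state the fragment length or the vector representation actually used, and the function names `dissect` and `one_hot` are hypothetical.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def dissect(sequence: str, flank: int = 4, pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, fragment) for every residue.

    Each fragment is centred on its residue with `flank` neighbours on
    each side; sequence ends are padded with `pad` (an assumed convention).
    """
    padded = pad * flank + sequence.upper() + pad * flank
    width = 2 * flank + 1
    return [(i + 1, padded[i:i + width]) for i in range(len(sequence))]


def one_hot(fragment: str) -> List[int]:
    """Encode a fragment as a flat 0/1 vector of length 20 * len(fragment).

    Padding and non-standard residues are left as all-zero blocks.
    """
    vector = [0] * (len(AMINO_ACIDS) * len(fragment))
    for pos, aa in enumerate(fragment):
        idx = AA_INDEX.get(aa)
        if idx is not None:
            vector[pos * len(AMINO_ACIDS) + idx] = 1
    return vector


# Toy query sequence (made up for illustration, not taken from Swiss-Prot).
fragments = dissect("MKRSTPLS", flank=2)
vectors = [one_hot(frag) for _, frag in fragments]
```

Vectors of this kind could then be passed to an SVM implementation, for example scikit-learn's `sklearn.svm.SVC`, trained on fragments around experimentally confirmed modification sites as positives; that pairing is a suggestion here, not a description of the authors' software.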

Pages: 453-461

Number of pages: 9
