A New Machine Learning Approach for Protein Phosphorylation Site Prediction in Plants

Authors
Gao, Jianjiong [1, 2]
Agrawal, Ganesh Kumar [2, 3]
Thelen, Jay J. [2, 3]
Obradovic, Zoran [4]
Dunker, A. Keith [5]
Xu, Dong [1, 2]
Affiliations
[1] Univ Missouri, Dept Comp Sci, Columbia, MO 65211 USA
[2] C.S. Bond Life Sci Ctr, Columbia, MO 65211 USA
[3] Univ Missouri, Dept Biochem, Columbia, MO 65211 USA
[4] Temple Univ, Ctr Informat Sci & Technol, Philadelphia, PA 19122 USA
[5] Indiana Univ Sch Med Informat, Ctr Computat Biol Bioinformat, Indianapolis, IN 46202 USA
Source
BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, PROCEEDINGS | 2009 / Vol. 5462
Funding
National Science Foundation (USA);
Keywords
Protein Phosphorylation; Phosphoproteomics; Arabidopsis; Protein Disorder; KNN; SVM; MASS-SPECTROMETRY; ARABIDOPSIS; DATABASE; INFORMATION; SEQUENCE; UPDATE; TOOL;
DOI
Not available
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Protein phosphorylation is a crucial regulatory mechanism in various organisms. With recent improvements in mass spectrometry, phosphorylation site data are accumulating rapidly. Despite this wealth of data, computational prediction of phosphorylation sites remains challenging. This is particularly true in plants, because little is known about the substrate specificities of plant protein kinases and because current phosphorylation prediction tools are trained on kinase-specific phosphorylation data from non-plant organisms. In this paper, we propose a new machine learning approach for phosphorylation site prediction. We incorporate protein sequence information and protein disordered regions, and integrate two machine learning techniques, k-nearest neighbor and support vector machine, to predict phosphorylation sites. In tests on the PhosPhAt dataset of phosphoserines in Arabidopsis and the TAIR7 non-redundant protein database, the proposed method performs well.
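The abstract does not give implementation details, so the following is only a minimal sketch of how a k-nearest-neighbor score over sequence windows around serines might be computed. The window half-width, the mismatch-fraction distance, the value of k, and the toy labeled windows are all illustrative assumptions, not the authors' settings; the function names are hypothetical. Such a score could plausibly serve as one input feature to a support vector machine alongside disorder information, but that step is not shown here.

```python
from typing import List, Tuple


def serine_windows(seq: str, half_width: int = 3) -> List[Tuple[int, str]]:
    """Return (position, window) for every serine, padding sequence ends with '-'."""
    padded = "-" * half_width + seq + "-" * half_width
    return [
        (i, padded[i : i + 2 * half_width + 1])
        for i, residue in enumerate(seq)
        if residue == "S"
    ]


def distance(a: str, b: str) -> float:
    """Fraction of mismatched positions between two equal-length windows.

    Assumption: a simple identity-based distance chosen for illustration.
    """
    return sum(x != y for x, y in zip(a, b)) / len(a)


def knn_score(window: str, labeled: List[Tuple[str, int]], k: int = 3) -> float:
    """Fraction of phosphorylated sites (label 1) among the k nearest labeled windows."""
    nearest = sorted(labeled, key=lambda item: distance(window, item[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Hypothetical toy training windows: 1 = phosphorylated, 0 = not phosphorylated.
TOY_LABELED = [("ASSPR", 1), ("KRSPE", 1), ("GLSAV", 0), ("LLSGG", 0)]

if __name__ == "__main__":
    for pos, win in serine_windows("MASSPRTS", half_width=2):
        print(pos, win, knn_score(win, TOY_LABELED, k=3))
```

In this sketch each candidate serine is reduced to a single number between 0 and 1, which keeps the downstream classifier's input small regardless of how many labeled sites are available.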
Pages: 18 / +
Number of pages: 4