LocTree3 prediction of localization

Cited by: 224
Authors
Goldberg, Tatyana [1, 2]
Hecht, Maximilian [1]
Hamp, Tobias [1]
Karl, Timothy [1]
Yachdav, Guy [1, 3]
Ahmed, Nadeem [1]
Altermann, Uwe [1]
Angerer, Philipp [1]
Ansorge, Sonja [1]
Balasz, Kinga [1]
Bernhofer, Michael [1]
Betz, Alexander [1]
Cizmadija, Laura [1]
Do, Kieu Trinh [1]
Gerke, Julia [1]
Greil, Robert [1]
Joerdens, Vadim [1]
Hastreiter, Maximilian [1]
Hembach, Katharina [1]
Herzog, Max [1]
Kalemanov, Maria [1]
Kluge, Michael [1]
Meier, Alice [1]
Nasir, Hassan [1]
Neumaier, Ulrich [1]
Prade, Verena [1]
Reeb, Jonas [1]
Sorokoumov, Aleksandr [1]
Troshani, Ilira [1]
Vorberg, Susann [1]
Waldraff, Sonja [1]
Zierer, Jonas [1]
Nielsen, Henrik [4]
Rost, Burkhard [1, 3, 5, 6, 7, 8]
Affiliations
[1] Tech Univ Munich, Dept Informat, D-85748 Garching, Germany
[2] TUM Grad Sch, Ctr Doctoral Studies Informat & its Applicat CeDo, D-85748 Garching, Germany
[3] Biosof LLC, New York, NY 10001 USA
[4] Tech Univ Denmark, Ctr Biol Sequence Anal, Dept Syst Biol, DK-2800 Lyngby, Denmark
[5] Tech Univ Munich, Inst Adv Study, D-85748 Garching, Germany
[6] Columbia Univ, NYCOMPS, New York, NY 10032 USA
[7] Columbia Univ, Dept Biochem & Mol Biophys, New York, NY 10032 USA
[8] Inst Food & Plant Sci WZW Weihenstephan, D-85350 Freising Weihenstephan, Germany
Keywords
PROTEIN LOCALIZATION; SEQUENCE; DATABASE
DOI
10.1093/nar/gku396
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18 = 80 ± 3% for eukaryotes and a six-state accuracy Q6 = 89 ± 4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome not considering the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3.
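
The multi-state accuracies quoted in the abstract (Q18, Q6) can be read as the percentage of proteins whose predicted class matches the observed one. The following is a minimal sketch under that assumption; the function names `q_accuracy` and `bootstrap_std`, the toy labels, and the use of bootstrap resampling to obtain a "±" spread are illustrative choices, not taken from the abstract or from the LocTree3 server.

```python
import random


def q_accuracy(observed, predicted):
    """Percentage of proteins whose predicted class equals the observed class."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def bootstrap_std(observed, predicted, rounds=1000, seed=0):
    """Spread of the accuracy over resampled test sets (assumed error estimate)."""
    rng = random.Random(seed)
    n = len(observed)
    scores = []
    for _ in range(rounds):
        # Draw n proteins with replacement and re-score the resampled set.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(
            q_accuracy([observed[i] for i in idx], [predicted[i] for i in idx])
        )
    mean = sum(scores) / rounds
    variance = sum((s - mean) ** 2 for s in scores) / (rounds - 1)
    return variance ** 0.5


# Hypothetical toy data: four proteins, one misclassified.
observed = ["nucleus", "cytoplasm", "nucleus", "secreted"]
predicted = ["nucleus", "cytoplasm", "secreted", "secreted"]
accuracy = q_accuracy(observed, predicted)
spread = bootstrap_std(observed, predicted)
```

With 18 localization classes the same function yields a Q18-style value; only the label vocabulary changes.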
Pages: W350-W355
Number of pages: 6