Predicting subcellular localization of proteins based on their N-terminal amino acid sequence

Cited by: 3741
Authors
Emanuelsson, O
Nielsen, H
Brunak, S
von Heijne, G [1]
Affiliations
[1] Univ Stockholm, Dept Biochem, Stockholm Bioinformat Ctr, S-10691 Stockholm, Sweden
[2] Tech Univ Denmark, Ctr Biol Sequence Anal, DK-2800 Lyngby, Denmark
Keywords
protein sorting; genome annotation; neural networks; targeting sequences; cleavage sites
DOI
10.1006/jmbi.2000.3903
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed. Using N-terminal sequence information only, it discriminates between proteins destined for the mitochondrion, the chloroplast, the secretory pathway, and "other" localizations with a success rate of 85 % (plant) or 90 % (non-plant) on redundancy-reduced test sets. From a TargetP analysis of the recently sequenced Arabidopsis thaliana chromosomes 2 and 4 and the Ensembl Homo sapiens protein set, we estimate that 10 % of all plant proteins are mitochondrial and 14 % chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10 %. TargetP also predicts cleavage sites with levels of correctly predicted sites ranging from approximately 40 % to 50 % (chloroplastic and mitochondrial presequences) to above 70 % (secretory signal peptides). TargetP is available as a web-server at http://www.cbs.dtu.dk/services/TargetP/. © 2000 Academic Press.
Pages: 1005-1016
Page count: 12