NLStradamus: a simple Hidden Markov Model for nuclear localization signal prediction

Cited by: 466
Authors
Nguyen Ba, Alex N. [1,2]
Pogoutse, Anastassia [1]
Provart, Nicholas [1,2]
Moses, Alan M. [1,2]
Affiliations
[1] Univ Toronto, Dept Cell & Syst Biol, Toronto, ON, Canada
[2] Univ Toronto, Ctr Anal Genome Evolut & Funct, Toronto, ON, Canada
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
IMPORTIN-ALPHA; KARYOPHERIN ALPHA; PORE COMPLEX; TRANSPORT; RECOGNITION; DISCOVERY; PROTEINS; GENE; BETA;
DOI
10.1186/1471-2105-10-202
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Nuclear localization signals (NLSs) are stretches of residues within a protein that are important for the regulated nuclear import of the protein. Of the many import pathways that exist in yeast, the best characterized is termed the 'classical' NLS pathway. The classical NLS contains specific patterns of basic residues and computational methods have been designed to predict the location of these motifs on proteins. The consensus sequences, or patterns, for the other import pathways are less well-understood.

Results: In this paper, we present an analysis of characterized NLSs in yeast, and find, despite the large number of nuclear import pathways, that NLSs seem to show similar patterns of amino acid residues. We test current prediction methods and observe a low true positive rate. We therefore suggest an approach using hidden Markov models (HMMs) to predict novel NLSs in proteins. We show that our method is able to consistently find 37% of the NLSs with a low false positive rate and that our method retains its true positive rate outside of the yeast data set used for the training parameters.

Conclusion: Our implementation of this model, NLStradamus, is made available at: http://www.moseslab.csb.utoronto.ca/NLStradamus/
Pages: 11