NovoHMM: A hidden Markov model for de novo peptide sequencing

Cited by: 114
Authors
Fischer, B [1]
Roth, V
Roos, F
Grossmann, J
Baginsky, S
Widmayer, P
Gruissem, W
Buhmann, JM
Affiliations
[1] Swiss Fed Inst Technol, Inst Comp Sci, Inst Plant Sci, Zurich, Switzerland
[2] Swiss Fed Inst Technol, Inst Theoret Comp Sci, Zurich, Switzerland
DOI
10.1021/ac0508853
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
De novo sequencing of peptides poses one of the most challenging tasks in data analysis for proteome research. In this paper, a generative hidden Markov model (HMM) of mass spectra for de novo peptide sequencing, which constitutes a novel view on how to solve this problem in a Bayesian framework, is proposed. Further extensions of the model structure to a graphical model and a factorial HMM, which substantially improve the peptide identification results, are demonstrated. Inference with the graphical model for de novo peptide sequencing estimates posterior probabilities for amino acids rather than scores for single symbols in the sequence. Our model outperforms state-of-the-art methods for de novo peptide sequencing on a large test set of spectra.
Pages: 7265-7273
Number of pages: 9