The statistical significance of protein identification results as a function of the number of protein sequences searched

Times cited: 14
Authors
Eriksson, J
Fenyö, D
Affiliations
[1] Swedish Univ Agr Sci, SE-75007 Uppsala, Sweden
[2] Amersham Biosci, Piscataway, NJ 08855 USA
[3] Rockefeller Univ, New York, NY 10021 USA
Keywords
protein identification; algorithm; bioinformatics; mass spectrometry; proteomics; protein; peptide; peptide mapping; significance testing; simulation;
DOI
10.1021/pr0499343
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
The potential for obtaining a true mass spectrometric protein identification result depends on the choice of algorithm as well as on experimental factors that influence the information content in the mass spectrometric data. Current methods can never prove definitively that a result is true, but an appropriate choice of algorithm can provide a measure of the statistical risk that a result is false, i.e., the statistical significance. We recently demonstrated an algorithm, Probity, which assigns a statistical significance to each result. For any choice of algorithm, the difficulty of obtaining statistically significant results depends on the number of protein sequences in the sequence collection searched. By simulations of random protein identifications and using the Probity algorithm, we here demonstrate explicitly how the statistical significance depends on the number of sequences searched. We also provide an example of how the practitioner's choice of taxonomic constraints influences the statistical significance.
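The dependence on the number of sequences searched can be illustrated with a minimal sketch. This is not the Probity algorithm, whose scoring is not described in this record. It assumes only that each random sequence independently has some small probability `p_single` of scoring at least as well as the observed result, so that the risk of at least one false match among `n_sequences` is 1 - (1 - p_single)^n_sequences. The function names, the independence assumption, and the numerical values are all hypothetical choices made for illustration.

```python
import random


def risk_of_false_match(p_single: float, n_sequences: int) -> float:
    """Probability that at least one of n independent random sequences
    scores as well as the observed result (illustrative assumption)."""
    return 1.0 - (1.0 - p_single) ** n_sequences


def simulate_risk(p_single: float, n_sequences: int,
                  trials: int = 2000, seed: int = 0) -> float:
    """Estimate the same risk by simulating random searches."""
    rng = random.Random(seed)
    false_hits = 0
    for _ in range(trials):
        # One simulated search: does any random sequence reach the score?
        if any(rng.random() < p_single for _ in range(n_sequences)):
            false_hits += 1
    return false_hits / trials


if __name__ == "__main__":
    p = 1e-3  # hypothetical per-sequence chance of a random high score
    for n in (10, 100, 1000):
        print(n, risk_of_false_match(p, n), simulate_risk(p, n))
```

Under these assumptions, restricting a search to fewer sequences (for example through a taxonomic constraint) lowers the risk of a random match for the same score, which is the qualitative effect the abstract describes.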
Pages: 979-982
Number of pages: 4