Consequences of the discontinuation of the International Protein Index (IPI) database and its substitution by the UniProtKB "complete proteome" sets

Cited by: 25

Authors:

Griss, Johannes [1,2]

Martin, Maria [1]

O'Donovan, Claire [1]

Apweiler, Rolf [1]

Hermjakob, Henning [1]

Vizcaino, Juan Antonio [1]

Affiliations:

[1] EMBL European Bioinformatics Institute, Cambridge CB10 1SD, England

[2] Medical University of Vienna, Department of Medicine I, Vienna, Austria

Source:

PROTEOMICS | 2011 / Volume 11 / Issue 22

Funding:

Wellcome Trust (UK);

Keywords:

Bioinformatics; Discontinuation; Gene annotation; International Protein Index; Protein databases; UniProt Knowledgebase; RESOURCE;

DOI:

10.1002/pmic.201100363

Chinese Library Classification:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

The International Protein Index (IPI) database has been one of the most widely used protein databases in MS proteomics approaches. Recently, the closure of IPI in September 2011 was announced. Its recommended replacement is the new UniProt Knowledgebase (UniProtKB) "complete proteome" sets, launched in May 2011. Here, we analyze the consequences of IPI's discontinuation for human and mouse data, and the effect of its substitution with UniProtKB on two levels: (i) data already produced and (ii) newly performed experiments. To estimate the effect on existing data, we investigated how well IPI identifiers map to UniProtKB accessions. We found that 21% of human and 10% of mouse identifiers do not map to UniProtKB and would thus be "lost." To investigate the impact on new experiments, we compared the theoretical search space (i.e. the tryptic peptides) of both resources and found that it is decreased by 14.0% for human and 8.9% for mouse data through IPI's closure. An analysis of the experimental evidence for these "lost" peptides showed that the vast majority have not been identified in experiments available in the major proteomics repositories. It thus seems likely that the search space provided by UniProtKB is of higher quality than the one currently provided by IPI.
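The search-space comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the cleavage rule (cut after K or R unless the next residue is P), the `min_length` and `missed_cleavages` parameters, and all function names are assumptions chosen for illustration, and the toy sequences are invented. The idea is to digest each database in silico, collect the distinct peptides, and report the fraction of the old peptide set that is absent from the new one.

```python
import re


def tryptic_peptides(sequence, min_length=6, missed_cleavages=0):
    """Return the set of in-silico tryptic peptides of one protein.

    Assumed rule: cleave C-terminal to K or R, but not before P.
    min_length and missed_cleavages are illustrative defaults only.
    """
    # Zero-width split after K/R when the next residue is not P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = set()
    for start in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        for extra in range(missed_cleavages + 1):
            end = start + extra + 1
            if end > len(fragments):
                break
            peptide = "".join(fragments[start:end])
            if len(peptide) >= min_length:
                peptides.add(peptide)
    return peptides


def search_space(sequences, **digest_options):
    """Union of tryptic peptides over all protein sequences in a database."""
    space = set()
    for sequence in sequences:
        space |= tryptic_peptides(sequence, **digest_options)
    return space


def lost_fraction(old_space, new_space):
    """Fraction of peptides in old_space that no longer occur in new_space."""
    if not old_space:
        return 0.0
    return len(old_space - new_space) / len(old_space)


# Toy stand-ins for the two databases (invented sequences, not real entries).
old_db = ["MKRPAAKGGR", "AAAAAAK"]
new_db = ["MKRPAAKGGR"]

old_space = search_space(old_db, min_length=2)
new_space = search_space(new_db, min_length=2)
print(f"lost: {lost_fraction(old_space, new_space):.1%}")
```

Comparing sets of distinct peptides rather than protein counts reflects what a search engine can actually match: a protein entry that disappears only shrinks the search space if its peptides are not also present in a retained entry.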

Pages: 4434-4438

Page count: 5
