UniProt archive

Cited by: 131
Authors
Leinonen, R [1]
Diez, FG [1]
Binns, D [1]
Fleischmann, W [1]
Lopez, R [1]
Apweiler, R [1]
Affiliations
[1] European Bioinformatics Institute, EMBL Outstation, Cambridge CB10 1SD, England
DOI
10.1093/bioinformatics/bth191
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
UniProt Archive (UniParc) is the most comprehensive, non-redundant protein sequence database available. Its protein sequences are retrieved from predominant, publicly accessible resources. All new and updated protein sequences are collected and loaded daily into UniParc for full coverage. To avoid redundancy, each unique sequence is stored only once with a stable protein identifier, which can be used later in UniParc to identify the same protein in all source databases. When proteins are loaded into the database, database cross-references are created to link them to the origins of the sequences. As a result, performing a sequence search against UniParc is equivalent to performing the same search against all databases cross-referenced by UniParc. UniParc contains only protein sequences and database cross-references; all other information must be retrieved from the source databases.
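The storage rule described in the abstract (one record per unique sequence, a stable identifier, and cross-references back to each source) can be illustrated with a minimal sketch. This is not UniParc's actual implementation: the class names `Entry` and `MiniArchive`, the identifier format `ID000001`, and the source names `DB_A` and `DB_B` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One unique sequence plus the places it was seen."""
    identifier: str                          # stable ID (hypothetical format)
    sequence: str
    xrefs: set = field(default_factory=set)  # (source database, accession) pairs


class MiniArchive:
    """Toy non-redundant store: each distinct sequence is kept exactly once."""

    def __init__(self):
        self._by_sequence = {}  # normalized sequence -> Entry
        self._next_number = 1

    def load(self, sequence, database, accession):
        """Store a sequence if it is new, then record where it came from."""
        seq = sequence.strip().upper()
        entry = self._by_sequence.get(seq)
        if entry is None:
            # First sighting: assign a new identifier that never changes.
            entry = Entry(identifier=f"ID{self._next_number:06d}", sequence=seq)
            self._next_number += 1
            self._by_sequence[seq] = entry
        # Seen before or not, add the cross-reference to the source record.
        entry.xrefs.add((database, accession))
        return entry.identifier

    def xrefs(self, sequence):
        """Return all source records that carry this exact sequence."""
        entry = self._by_sequence.get(sequence.strip().upper())
        return set(entry.xrefs) if entry else set()

    def __len__(self):
        return len(self._by_sequence)


if __name__ == "__main__":
    archive = MiniArchive()
    first = archive.load("MKTAYIAK", "DB_A", "A001")
    second = archive.load("MKTAYIAK", "DB_B", "B042")  # same protein, other source
    print(first == second, len(archive), sorted(archive.xrefs("MKTAYIAK")))
```

Because the identical sequence from two sources collapses into one entry, a single lookup against the archive reaches every source record through its cross-references, which is the property the abstract relies on when it equates one UniParc search with searching all cross-referenced databases.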
Pages: 3236-3237
Number of pages: 2