RefSeq: an update on mammalian reference sequences

Cited by: 720
Authors
Pruitt, Kim D. [1]
Brown, Garth R. [1]
Hiatt, Susan M. [1]
Thibaud-Nissen, Francoise [1]
Astashyn, Alexander [1]
Ermolaeva, Olga [1]
Farrell, Catherine M. [1]
Hart, Jennifer [1]
Landrum, Melissa J. [1]
McGarvey, Kelly M. [1]
Murphy, Michael R. [1]
O'Leary, Nuala A. [1]
Pujar, Shashikant [1]
Rajput, Bhanu [1]
Rangwala, Sanjida H. [1]
Riddick, Lillian D. [1]
Shkeda, Andrei [1]
Sun, Hanzhen [1]
Tamez, Pamela [1]
Tully, Raymond E. [1]
Wallin, Craig [1]
Webb, David [1]
Weber, Janet [1]
Wu, Wendy [1]
DiCuccio, Michael [1]
Kitts, Paul [1]
Maglott, Donna R. [1]
Murphy, Terence D. [1]
Ostell, James M. [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
Funding
US National Institutes of Health;
Keywords
DATABASE; RESOURCES;
DOI
10.1093/nar/gkt1114
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of annotated genomic, transcript and protein sequence records derived from data in public sequence archives and from computation, curation and collaboration (http://www.ncbi.nlm.nih.gov/refseq/). We report here on growth of the mammalian and human subsets, changes to NCBI's eukaryotic annotation pipeline and modifications affecting transcript and protein records. Recent changes to NCBI's eukaryotic genome annotation pipeline provide higher throughput, and the addition of RNAseq data to the pipeline results in a significant expansion of the number of transcripts and novel exons annotated on mammalian RefSeq genomes. Recent annotation changes include reporting supporting evidence for transcript records, modification of exon feature annotation and the addition of a structured report of gene and sequence attributes of biological interest. We also describe a revised protein annotation policy for alternatively spliced transcripts with more divergent predicted proteins and we summarize the current status of the RefSeqGene project.
Pages: D756-D763
Number of pages: 8