HGVbase: a human sequence variation database emphasizing data quality and a broad spectrum of data sources

Cited by: 105
Authors
Fredman, D
Siegfried, M
Yuan, YP
Bork, P
Lehväslaiho, H
Brookes, AJ
Affiliations
[1] Karolinska Inst, Ctr Genom & Bioinformat, S-17177 Stockholm, Sweden
[2] European Mol Biol Lab, D-69117 Heidelberg, Germany
[3] European Bioinformat Inst, Hinxton CB10 1SD, Cambs, England
DOI
10.1093/nar/30.1.387
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
HGVbase (Human Genome Variation database; http://hgvbase.cgb.ki.se, formerly known as HGBASE) is an academic effort to provide a high quality and non-redundant database of available genomic variation data of all types, mostly comprising single nucleotide polymorphisms (SNPs). Records include neutral polymorphisms as well as disease-related mutations. Online search tools facilitate data interrogation by sequence similarity and keyword queries, and searching by genome coordinates is now being implemented. Downloads are freely available in XML, Fasta, SRS, SQL and tagged-text file formats. Each entry is presented in the context of its surrounding sequence and many records are related to neighboring human genes and affected features therein. Population allele frequencies are included wherever available. Thorough semi-automated data checking ensures internal consistency and addresses common errors in the source information. To keep pace with recent growth in the field, we have developed tools for fully automated annotation. All variants have been uniquely mapped to the draft genome sequence and are referenced to positions in EMBL/GenBank files. Data utility is enhanced by provision of genotyping assays and functional predictions. Recent data structure extensions allow the capture of haplotype and genotype information, and a new initiative (along with BiSC and HUGO-MDI) aims to create a central repository for the broad collection of clinical mutations and associated disease phenotypes of interest.
Pages: 387-391
Page count: 5