WebSTR: A Population-wide Database of Short Tandem Repeat Variation in Humans

Cited by: 12
Authors
Lundstrom, Oxana [1, 2, 3]
Verbiest, Max Adriaan [3, 4, 5]
Xia, Feifei [3, 4, 5]
Jam, Helyaneh Ziaei [7]
Zlobec, Inti [6]
Anisimova, Maria [3, 4]
Gymrek, Melissa [7, 8]
Affiliations
[1] Stockholm Univ, Dept Biochem & Biophys, Stockholm, Sweden
[2] Vildly AB, Kalmar, Sweden
[3] Zurich Univ Appl Sci ZHAW, Inst Computat Life Sci, Sch Life Sci & Facil Management, Wadenswil, Switzerland
[4] Swiss Inst Bioinformat SIB, Lausanne, Switzerland
[5] Univ Zurich, Dept Mol Life Sci, Zurich, Switzerland
[6] Univ Bern, Inst Tissue Med & Pathol, Bern, Switzerland
[7] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[8] Univ Calif San Diego, Dept Med, La Jolla, CA USA
Funding
Swiss National Science Foundation
Keywords
human genetic variation; short tandem repeats; database; API; web portal; GENE-EXPRESSION
DOI
10.1016/j.jmb.2023.168260
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Short tandem repeats (STRs) are consecutive repetitions of one to six nucleotide motifs. They are hypervariable due to the high prevalence of repeat unit insertions or deletions primarily caused by polymerase slippage during replication. Genetic variation at STRs has been shown to influence a range of traits in humans, including gene expression, cancer risk, and autism. Until recently, STRs have been poorly studied since they pose significant challenges to bioinformatics analyses. Moreover, genome-wide analysis of STR variation in population-scale cohorts requires large amounts of data and computational resources. However, the recent advent of genome-wide analysis tools has resulted in multiple large genome-wide datasets of STR variation spanning nearly two million genomic loci in thousands of individuals from diverse populations. Here we present WebSTR, a database of genetic variation and other characteristics of genome-wide STRs across human populations. WebSTR is based on reference panels of more than 1.7 million human STRs created with state-of-the-art repeat annotation methods and can easily be extended to include additional cohorts or species. It currently contains data based on STR genotypes for individuals from the 1000 Genomes Project, H3Africa, the Genotype-Tissue Expression (GTEx) Project and colorectal cancer patients from the TCGA dataset. WebSTR is implemented as a relational database with programmatic access available through an API and a web portal for browsing data. The web portal is publicly available at https://webstr.ucsd.edu.
© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
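As an illustration of the STR definition given in the abstract (consecutive repetitions of motifs one to six nucleotides long), the following minimal Python sketch scans a DNA string for perfect tandem repeats. It is not the repeat annotation method used to build WebSTR; the function name `find_strs`, the `min_copies` threshold of three, and the restriction to perfect (uninterrupted) repeats are all assumptions made for the example.

```python
import re

def find_strs(seq, min_copies=3, max_motif=6):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Hypothetical helper for illustration only: motif lengths 1..max_motif,
    at least min_copies consecutive copies, no mismatches allowed.
    """
    hits = []
    for k in range(1, max_motif + 1):
        # A k-nucleotide motif followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "CACA" is already reported as "CA").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence: a (CA)4 dinucleotide repeat and a (GATA)3 tetranucleotide repeat.
example = "TTG" + "CA" * 4 + "TT" + "GATA" * 3 + "CC"
print(find_strs(example))
```

Each tuple gives the 0-based start position, the repeat unit, and the copy number. Variation in that copy number between individuals is the kind of STR variation the abstract describes.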
Pages: 8