Detecting short tandem repeats from genome data: opening the software black box

Cited by: 57
Authors
Merkel, Angelika [1]
Gemmell, Neil [2]
Affiliations
[1] Univ Canterbury, Sch Biol Sci, Christchurch 8041, New Zealand
[2] Univ Otago, Dunedin, New Zealand
Keywords
microsatellite; tandem repeat; genome; algorithm; software; method; comparison
DOI
10.1093/bib/bbn028
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Short tandem repeats, specifically microsatellites, are widely used as genetic markers, are associated with human genetic diseases, and play an important role in various regulatory mechanisms and in evolution. Despite their importance, much remains unknown about their mutational dynamics. The increasing availability of genome data has led to several in silico studies of microsatellite evolution, which have produced a vast range of algorithms and software for tandem repeat detection. Documentation of these tools is often sparse, or provided in a format that is impenetrable to most biologists without an informatics background. This article introduces the major concepts behind repeat-detecting software that are essential for informed tool selection. We reflect on issues such as parameter settings and program bias, as well as redundancy filtering and efficiency, using examples from the currently available range of programs, to provide an integrated comparison and practical guide to microsatellite-detecting programs.
Pages: 355-366
Page count: 12