Fragment Merger: An Online Tool to Merge Overlapping Long Sequence Fragments
Cited by: 29
Authors:
Bell, Trevor G. [1]
Kramvis, Anna [1]
Affiliations:
[1] Univ Witwatersrand, Fac Hlth Sci, Sch Clin Med, HVDRP, Dept Internal Med, ZA-2050 Johannesburg, South Africa
Source:
VIRUSES-BASEL | 2013 / Vol. 5 / Issue 03
Funding:
UK Medical Research Council;
National Research Foundation, Singapore;
Keywords:
sequence data;
sequence fragments;
chromatograms;
DNA assembly;
amplicons;
hepatitis B virus;
HEPATITIS-B-VIRUS;
MUTATIONS;
MUTANTS;
DOI:
10.3390/v5030824
Chinese Library Classification:
Q93 [Microbiology];
Discipline classification codes:
071005 ;
100705 ;
Abstract:
While PCR amplicons extend to a few thousand bases, the length of sequences from direct Sanger sequencing is limited to 500-800 nucleotides. Therefore, several fragments may be required to cover an amplicon, a gene or an entire genome. These fragments are typically sequenced in an overlapping fashion and assembled by manually sliding and aligning the sequences visually. This is time-consuming, repetitive and error-prone, and further complicated by circular genomes. An online tool merging two to twelve long overlapping sequence fragments was developed. Either chromatograms or FASTA files are submitted to the tool, which trims poor quality ends of chromatograms according to user-specified parameters. Fragments are assembled into a single sequence by repeatedly calling the EMBOSS merger tool in a consecutive manner. Output includes the number of trimmed nucleotides, details of each merge, and an optional alignment to a reference sequence. The final merged sequence is displayed and can be downloaded in FASTA format. All output files can be downloaded as a ZIP archive. This tool allows for easy and automated assembly of overlapping sequences and is aimed at researchers without specialist computer skills. The tool is genome- and organism-agnostic and has been developed using hepatitis B virus sequence data.
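
The trim-then-merge workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the tool's code. The function names (trim_ends, merge_pair, merge_all), the quality threshold and the minimum overlap are all hypothetical. The pairwise step here uses exact suffix/prefix matching as a simplified stand-in for EMBOSS merger, which aligns the two sequences rather than requiring an exact match, and circular genomes are not handled.

```python
from functools import reduce


def trim_ends(seq, quals, min_q=20):
    """Trim both ends until a base with quality >= min_q is reached.

    Returns the trimmed sequence and the number of bases removed.
    min_q is a hypothetical user-specified parameter.
    """
    start = 0
    while start < len(seq) and quals[start] < min_q:
        start += 1
    end = len(seq)
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], start + (len(seq) - end)


def merge_pair(left, right, min_overlap=10):
    """Join two fragments on their longest exact suffix/prefix overlap.

    Simplified stand-in for a pairwise merge; it does not tolerate
    mismatches in the overlapping region.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError(f"no overlap of at least {min_overlap} bases")


def merge_all(fragments, min_overlap=10):
    """Merge fragments consecutively: ((f1 + f2) + f3) + ..."""
    return reduce(
        lambda merged, frag: merge_pair(merged, frag, min_overlap),
        fragments,
    )


if __name__ == "__main__":
    # Toy data: three overlapping windows cut from one made-up sequence.
    genome = "ACGTTGCAAGCTTAGGCTAACCGGTTAACGATCGATTACG"
    fragments = [genome[0:20], genome[12:32], genome[25:40]]
    print(merge_all(fragments, min_overlap=5) == genome)
```

The fold in merge_all mirrors the consecutive manner of merging: the result of each pairwise merge becomes the left-hand input of the next, so fragments must be supplied in order along the target sequence.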