A parallel hash-based method for local sequence alignment

Cited by: 2
Authors
Esmat, Aghaee-Meybodi [1]
Amin, Nezarat [2]
Sima, Emadi [1]
Reza, Ghaffari Mohammad [3]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Yazd Branch, Yazd, Iran
[2] Masaryk Univ, Inst Comp Sci, Brno, Czech Republic
[3] Agr Res Educ & Extens Org, Dept Syst Biol, Agr Biotechnol Res Inst Iran, Tehran, Iran
Keywords
DNA sequencing; hash table; local alignment; sequence alignment; string matching; READ ALIGNMENT; SEARCH; ACID;
DOI
10.1002/cpe.6568
Chinese Library Classification number
TP31 [Computer software];
Subject classification codes
081202 ; 0835 ;
Abstract
Alignment algorithms that use an index-based strategy, such as a hash table, typically rely on the seed-and-extend method, which is time-consuming. Here, we developed a hash-based search algorithm, based on the SSAHA method but without seed-and-extend, that uses multiple processors to search and align faster than previous methods. In the proposed method, applying the overlapping method to the query and reference sequences increases accuracy and sensitivity. Speed also increases because a hash table is created for the reference sequence when it is placed in memory. Furthermore, using three datasets of sequences that differ in size and volume, we evaluated the effect of the created piece lengths and of multiple processors on each dataset. The results indicate that the method not only alleviates the time cost of alignment but also improves mapping speed compared with the BLAST and SSAHA algorithms.
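The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of hash-based lookup with overlapping pieces, not the authors' algorithm. It assumes that "overlapping" means sliding a window of length `k` one base at a time over both sequences, and it ranks candidate positions by counting hits per offset instead of extending seeds. The names `build_index`, `best_offset`, and the parameter `k` are hypothetical, and the multi-processor part is omitted.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map every overlapping k-mer of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def best_offset(index, query, k):
    """Return (offset, hits) for the reference offset supported by the most
    k-mer matches, or None when no k-mer of the query occurs in the index."""
    votes = Counter()
    for qpos in range(len(query) - k + 1):
        # .get avoids inserting empty entries into the defaultdict
        for rpos in index.get(query[qpos:qpos + k], ()):
            votes[rpos - qpos] += 1  # matches on one diagonal share an offset
    if not votes:
        return None
    return votes.most_common(1)[0]


# Toy data chosen for illustration only (assumption, not from the paper).
reference = "ACGTACGTTAGCCGATAGGCTA"
index = build_index(reference, k=4)
result = best_offset(index, "TAGCCGAT", k=4)
print(result)
```

Because the index is built once and kept in memory, each query costs only dictionary lookups; independent queries could in principle be distributed across worker processes, which is one plausible reading of the multi-processor speed-up described above.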
Pages: 16