BitPAl: a bit-parallel, general integer-scoring sequence alignment algorithm

被引:20
作者
Loving, Joshua [1 ,2 ]
Hernandez, Yozen [1 ,2 ]
Benson, Gary [1 ,2 ,3 ]
机构
[1] Boston Univ, Lab Biocomp & Informat, Boston, MA 02215 USA
[2] Boston Univ, Grad Program Bioinformat, Boston, MA 02215 USA
[3] Boston Univ, Dept Comp Sci, Boston, MA 02215 USA
基金
美国国家科学基金会;
关键词
VECTOR ALGORITHM;
D O I
10.1093/bioinformatics/btu507
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Mapping of high-throughput sequencing data and other bulk sequence comparison applications have motivated a search for high-efficiency sequence alignment algorithms. The bit-parallel approach represents individual cells in an alignment scoring matrix as bits in computer words and emulates the calculation of scores by a series of logic operations composed of AND, OR, XOR, complement, shift and addition. Bit-parallelism has been successfully applied to the longest common subsequence (LCS) and edit-distance problems, producing fast algorithms in practice. Results: We have developed BitPAl, a bit-parallel algorithm for general, integer-scoring global alignment. Integer-scoring schemes assign integer weights for match, mismatch and insertion/deletion. The BitPAl method uses structural properties in the relationship between adjacent scores in the scoring matrix to construct classes of efficient algorithms, each designed for a particular set of weights. In timed tests, we show that BitPAl runs 7-25 times faster than a standard iterative algorithm.
引用
收藏
页码:3166 / 3173
页数:8
相关论文
共 17 条
[1]   A BIT-STRING LONGEST-COMMON-SUBSEQUENCE ALGORITHM [J].
ALLISON, L ;
DIX, TI .
INFORMATION PROCESSING LETTERS, 1986, 23 (06) :305-310
[2]   A NEW APPROACH TO TEXT SEARCHING [J].
BAEZAYATES, R ;
GONNET, GH .
COMMUNICATIONS OF THE ACM, 1992, 35 (10) :74-82
[3]  
Benson G, 2013, LECT NOTES COMPUT SC, V7922, P50, DOI 10.1007/978-3-642-38905-4_7
[4]  
Bergeron A., 2002, International Journal of Foundations of Computer Science, V13, P53, DOI 10.1142/S0129054102000947
[5]   A fast and practical bit-vector algorithm for the Longest Common Subsequence problem [J].
Crochemore, M ;
Iliopoulos, CS ;
Pinzon, YJ ;
Reid, JF .
INFORMATION PROCESSING LETTERS, 2001, 80 (06) :279-285
[6]   VNTRseek--a computational tool to detect tandem repeat variants in high-throughput sequencing data [J].
Gelfand, Yevgeniy ;
Hernandez, Yozen ;
Loving, Joshua ;
Benson, Gary .
NUCLEIC ACIDS RESEARCH, 2014, 42 (14) :8884-8894
[7]   Bit-parallel witnesses and their applications to approximate string matching [J].
Hyyrö, H ;
Navarro, G .
ALGORITHMICA, 2005, 41 (03) :203-231
[8]  
HYYRO H, 2005, J EXPT ALGORITHMICS, V10, P2, DOI DOI 10.1145/1064546.1180617
[9]  
Hyyro H, 2004, P 15 AUSTR WORKSH CO
[10]   Bit-parallel computation of local similarity score matrices with unitary weights [J].
Hyyro, Heikki ;
Navarro, Gonzalo .
INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2006, 17 (06) :1325-1344