Enhancing N-Gram-Hirschberg Algorithm by Using Hash Function

被引:3
作者
Abu-Hashem, Muhannad A. [1 ]
Rashid, Nur'Aini Abdul [1 ]
机构
[1] Univ Sains Malaysia, Sch Comp Sci, George Town, Malaysia
来源
2009 THIRD ASIA INTERNATIONAL CONFERENCE ON MODELLING & SIMULATION, VOLS 1 AND 2 | 2009年
关键词
SIMILARITY SEARCHES; PROTEIN;
D O I
10.1109/AMS.2009.112
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Dynamic programming-based algorithm such as Smith-Waterman algorithm, which produces the most optimal result, has been known as one of the most used algorithm for sequence alignment. Hirschberg algorithm is the space saving version of Smith-Waterman algorithm. However, both algorithms are still very computational intensive. The N-Gram-Hirschberg algorithm is introduced to further reduced the space requirement and at the same time, to speed up the sequences alignment algorithm. This research aims to enhance the N-Gram-Hirschberg algorithm by embedding the Hashing function, adopted from an exact string matching algorithm called Karp-Rabin. The hash function is used to enhance the transformation process for the algorithm. The new method improves the processing time of the N-Gram-Hirschberg without sacrificing the quality of the output. The best time enhancement we got was when word length is two for protein sequence length ranges between 100-1000.
引用
收藏
页码:282 / +
页数:2
相关论文
共 9 条
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]  
BOUKERCHEA A, 2006, PARALLEL STRATEGIES
[3]   LINEAR SPACE ALGORITHM FOR COMPUTING MAXIMAL COMMON SUBSEQUENCES [J].
HIRSCHBERG, DS .
COMMUNICATIONS OF THE ACM, 1975, 18 (06) :341-343
[4]   RAPID AND SENSITIVE PROTEIN SIMILARITY SEARCHES [J].
LIPMAN, DJ ;
PEARSON, WR .
SCIENCE, 1985, 227 (4693) :1435-1441
[5]  
LIU Y, 2006, UCRLCONF218814 DOE J
[6]  
OWOLABI O, 2002, EMPIRICAL STUDIES SO
[7]   COMPARISON OF METHODS FOR SEARCHING PROTEIN-SEQUENCE DATABASES [J].
PEARSON, WR .
PROTEIN SCIENCE, 1995, 4 (06) :1145-1160
[8]   IMPROVED TOOLS FOR BIOLOGICAL SEQUENCE COMPARISON [J].
PEARSON, WR ;
LIPMAN, DJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1988, 85 (08) :2444-2448
[9]   RAPID SIMILARITY SEARCHES OF NUCLEIC-ACID AND PROTEIN DATA BANKS [J].
WILBUR, WJ ;
LIPMAN, DJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA-BIOLOGICAL SCIENCES, 1983, 80 (03) :726-730