Algorithms for the longest common subsequence problem for multiple strings based on geometric maxima

被引:17
作者
Hakata, K
Imai, H
机构
[1] NEC Corp Ltd, Kawasaki, Kanagawa, Japan
[2] Univ Tokyo, Dept Informat Sci, Bunkyo Ku, Tokyo 1130033, Japan
关键词
longest common subsequence; dynamic programming; geometric maxima; computational geometry;
D O I
10.1080/10556789808805713
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Given two or more strings (for example, DNA and amino acid sequences), the longest common subsequence (LCS for short) problem is to determine the longest common subsequence obtained by deleting zero or more symbols from each string. The algorithms for computing an LCS between two strings were given by many papers, but there are few efficient algorithms for computing an LCS between more than two strings. This paper proposes a method for computing efficiently the LCS between three and more strings of small alphabet size, evaluates its theoretical time complexity, and estimates the computing time by computational experiments. Using this method, the LCS problem for eight strings of more than 120 length can be solved in about 40 min on a slow workstation.
引用
收藏
页码:233 / 260
页数:28
相关论文
共 29 条
[1]  
AHO AV, 1976, J ACM, V23, P1, DOI 10.1145/321921.321922
[2]   THE LONGEST COMMON SUBSEQUENCE PROBLEM REVISITED [J].
APOSTOLICO, A ;
GUERRA, C .
ALGORITHMICA, 1987, 2 (03) :315-336
[3]   THE ERDOS-RENYI LAW IN DISTRIBUTION, FOR COIN TOSSING AND SEQUENCE MATCHING [J].
ARRATIA, R ;
GORDON, L ;
WATERMAN, MS .
ANNALS OF STATISTICS, 1990, 18 (02) :539-570
[4]   CRITICAL PHENOMENA IN SEQUENCE MATCHING [J].
ARRATIA, R ;
WATERMAN, MS .
ANNALS OF PROBABILITY, 1985, 13 (04) :1236-1249
[5]   SEARCHING SUBSEQUENCES [J].
BAEZAYATES, RA .
THEORETICAL COMPUTER SCIENCE, 1991, 78 (02) :363-376
[6]   THE PARAMETERIZED COMPLEXITY OF SEQUENCE ALIGNMENT AND CONSENSUS [J].
BODLAENDER, HL ;
DOWNEY, RG ;
FELLOWS, MR ;
WAREHAM, HT .
THEORETICAL COMPUTER SCIENCE, 1995, 147 (1-2) :31-54
[7]   PERFORMANCE ANALYSIS OF SOME SIMPLE HEURISTICS FOR COMPUTING LONGEST COMMON SUBSEQUENCES [J].
CHIN, F ;
POON, CK .
ALGORITHMICA, 1994, 12 (4-5) :293-311
[8]  
DAYHOFF MO, 1969, SCI AM, V221, P87
[9]   COMPUTER AIDS TO PROTEIN SEQUENCE DETERMINATION [J].
DAYHOFF, MO .
JOURNAL OF THEORETICAL BIOLOGY, 1965, 8 (01) :97-&
[10]   COMPUTING LENGTH OF LONGEST INCREASING SUBSEQUENCES [J].
FREDMAN, ML .
DISCRETE MATHEMATICS, 1975, 11 (01) :29-35