Classes of cost functions for string edit distance

被引:0
作者
S. V. Rice
H. Bunke
T. A. Nartker
机构
[1] University of Nevada,Information Science Research Institute
[2] Las Vegas,Department of Computer Science
[3] University of Bern,undefined
来源
Algorithmica | 1997年 / 18卷
关键词
String comparison; Edit distance; Cost function; Equivalence class; Longest common subsequence;
D O I
暂无
中图分类号
学科分类号
摘要
Finding a sequence of edit operations that transforms one string of symbols into another with the minimum cost is a well-known problem. The minimum cost, or edit distance, is a widely used measure of the similarity of two strings. An important parameter of this problem is the cost function, which specifies the cost of each insertion, deletion, and substitution. We show that cost functions having the same ratio of the sum of the insertion and deletion costs divided by the substitution cost yield the same minimum cost sequences of edit operations. This leads to a partitioning of the universe of cost functions into equivalence classes. Also, we show the relationship between a particular set of cost functions and the longest common subsequence of the input strings.
引用
收藏
页码:271 / 280
页数:9
相关论文
共 9 条
[1]  
Gusfield D.(1994)Parametric optimization of sequence alignment Algorithmica 12 312-326
[2]  
Balasubramanian K.(1995)Parametric string edit distance and its application to pattern recognition IEEE Trans. Systems Man Cybernet. 25 202-206
[3]  
Naor D.(1974)The string-to-string correction problem J. Assoc. Comput. Mach. 21 168-173
[4]  
Bunke H.(1985)Algorithms for approximate string matching Inform. and Control 64 100-118
[5]  
Csirik J.(1986)An Algorithmica 1 251-266
[6]  
Wagner R. A.(undefined) difference algorithm and its variations undefined undefined undefined-undefined
[7]  
Fischer M. J.(undefined)undefined undefined undefined undefined-undefined
[8]  
Ukkonen E.(undefined)undefined undefined undefined undefined-undefined
[9]  
Myers E. W.(undefined)undefined undefined undefined undefined-undefined