Pattern matching with wildcards and length constraints using maximum network flow

被引:3
作者
Arslan, Abdullah N. [1 ]
He, Dan [2 ]
He, Yu
Wu, Xindong [3 ]
机构
[1] Texas A&M Univ, Dept Comp Sci & Informat Syst, Commerce, TX 75428 USA
[2] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
[3] Univ Vermont, Dept Comp Sci, Burlington, VT 05401 USA
基金
美国国家科学基金会;
关键词
Pattern matching with wildcards; String algorithm; Maximum flow; Maximal vertex-disjoint paths; Maximal edge-disjoint paths;
D O I
10.1016/j.jda.2015.08.003
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
In 2006 Chen et al. introduced an interesting text pattern matching problem with unique features. A pattern is described by a sequence of letters (literals) separated by a number of wildcards. For each pair of consecutive literals in the pattern description, a local length constraint specifies the minimum and maximum number of wildcards. Also, the pattern must occur in the text within a window for which the minimum and maximum lengths are specified by the global length constraint. Text letters are exclusively assigned to occurrences (the one-off condition). The objective is to find the maximum number of occurrences of the pattern in the text with all the constraints. For the problem (in this paper we call it Problem 1), there is a polynomial time online algorithm which does not guarantee an optimal solution. There are also polynomial time heuristic algorithms for the problem, and for some of its restricted cases. It is not known if Problem 1 is NP-hard. In this work, for two special cases of Problem 1, we give reductions to the maximum flow problem which can be solved in polynomial time. (C) 2015 Elsevier B.V. All rights reserved.
引用
收藏
页码:9 / 16
页数:8
相关论文
共 17 条
[1]  
[Anonymous], 1970, SOVIET MATH DOKL
[2]   Efficient string matching with wildcards and length constraints [J].
Chen, Gong ;
Wu, Xindong ;
Zhu, Xingquan ;
Arslan, Abdullah N. ;
He, Yu .
KNOWLEDGE AND INFORMATION SYSTEMS, 2006, 10 (04) :399-419
[3]  
Even S., 1975, SIAM Journal on Computing, V4, P507, DOI 10.1137/0204043
[4]  
Ford L., 1962, FLOWS NETWORKS
[5]  
Haiping Wang, 2010, Proceedings of the 2010 IEEE International Conference on Granular Computing (GrC-2010), P782, DOI 10.1109/GrC.2010.156
[6]  
He D, 2007, PROCEEDINGS OF THE 7TH IEEE INTERNATIONAL SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, VOLS I AND II, P1199
[7]   THE COMPLEXITY OF FINDING MAXIMUM DISJOINT PATHS WITH LENGTH CONSTRAINTS [J].
ITAI, A ;
PERL, Y ;
SHILOACH, Y .
NETWORKS, 1982, 12 (03) :277-286
[8]  
Karzanov A. V, 1974, SOV MATH DOKL, V15, P434
[9]   Improved inapproximability results for MaxClique, chromatic number and approximate graph coloring [J].
Khot, S .
42ND ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 2001, :600-609
[10]  
Leeuven J.V., 1990, HDB THEORETICAL COMP, V1, P561