A Space-Bounded Anytime Algorithm for the Multiple Longest Common Subsequence Problem

被引:23
作者
Yang, Jiaoyun [1 ,2 ]
Xu, Yun [1 ,2 ]
Shang, Yi [3 ]
Chen, Guoliang [1 ,2 ]
机构
[1] Univ Sci & Technol China, Key Lab High Performance Comp, Hefei, Anhui, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci, Hefei, Anhui, Peoples R China
[3] Univ Missouri, Dept Comp Sci, Columbia, MO 65211 USA
基金
中国国家自然科学基金;
关键词
Heuristic search; anytime algorithm; space bounded; multiple longest common subsequence (MLCS); HEURISTIC-SEARCH; BEAM SEARCH;
D O I
10.1109/TKDE.2014.2304464
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The multiple longest common subsequence (MLCS) problem, related to the identification of sequence similarity, is an important problem in many fields. As an NP-hard problem, its exact algorithms have difficulty in handling large-scale data and time-and space-efficient algorithms are required in real-world applications. To deal with time constraints, anytime algorithms have been proposed to generate good solutions with a reasonable time. However, there exists little work on space-efficient MLCS algorithms. In this paper, we formulate the MLCS problem into a graph search problem and present two space-efficient anytime MLCS algorithms, SA-MLCS and SLA-MLCS. SA-MLCS uses an iterative beam widening search strategy to reduce space usage during the iterative process of finding better solutions. Based on SA-MLCS, SLA-MLCS, a space-bounded algorithm, is developed to avoid space usage from exceeding available memory. SLA-MLCS uses a replacing strategy when SA-MLCS reaches a given space bound. Experimental results show SA-MLCS and SLA-MLCS use an order of magnitude less space and time than the state-of-the-art approximate algorithm MLCS-APP while finding better solutions. Compared to the state-of-the-art anytime algorithm Pro-MLCS, SA-MLCS and SLA-MLCS can solve an order of magnitude larger size instances. Furthermore, SLA-MLCS can find much better solutions than SA-MLCS on large size instances.
引用
收藏
页码:2599 / 2609
页数:11
相关论文
共 45 条
[1]  
Aine S, 2007, 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, P2250
[2]  
[Anonymous], 2005, ICAPS
[3]   FAST LINEAR-SPACE COMPUTATIONS OF LONGEST COMMON SUBSEQUENCES [J].
APOSTOLICO, A ;
BROWNE, S ;
GUERRA, C .
THEORETICAL COMPUTER SCIENCE, 1992, 92 (01) :3-17
[4]   FINGERPRINTING G-PROTEIN-COUPLED RECEPTORS [J].
ATTWOOD, TK ;
FINDLAY, JBC .
PROTEIN ENGINEERING, 1994, 7 (02) :195-203
[5]   A survey of longest common subsequence algorithms [J].
Bergroth, L ;
Hakonen, H ;
Raita, T .
SPIRE 2000: SEVENTH INTERNATIONAL SYMPOSIUM ON STRING PROCESSING AND INFORMATION RETRIEVAL - PROCEEDINGS, 2000, :39-48
[6]  
Blum C, 2007, LECT NOTES COMPUT SC, V4638, P150
[7]   Beam search for the longest common subsequence problem [J].
Blum, Christian ;
Blesa, Maria J. ;
Lopez-Ibanez, Manuel .
COMPUTERS & OPERATIONS RESEARCH, 2009, 36 (12) :3178-3186
[8]   HEURISTIC-SEARCH IN RESTRICTED MEMORY [J].
CHAKRABARTI, PP ;
GHOSE, S ;
ACHARYA, A ;
DESARKAR, SC .
ARTIFICIAL INTELLIGENCE, 1989, 41 (02) :197-221
[9]   A fast parallel algorithm for finding the longest common sequence of multiple biosequences [J].
Chen, Yixin ;
Wan, Andrew ;
Liu, Wei .
BMC BIOINFORMATICS, 2006, 7 (Suppl 4)
[10]  
FURCY D, 2006, P NAT C ART INT AAAI