Efficient CGM-based parallel algorithms for the longest common subsequence problem with multiple substring-exclusion constraints

被引:7
作者
Tchendji, Vianney Kengne [1 ]
Ngomade, Armel Nkonjoh [1 ]
Zeutouo, Jerry Lacmou [1 ]
Myoupo, Jean Frederic [2 ]
机构
[1] Univ Dschang, Dept Math & Comp Sci, Dschang, Cameroon
[2] Univ Picardie Jules Verne, Comp Sci Lab MIS, Amiens, France
关键词
Parallel algorithms; Coarse grained multicomputer; Dynamic programming; Multiple-constrained LCS; Direct acyclic graph;
D O I
10.1016/j.parco.2019.102598
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
A variant of the Longest Common Subsequence (LCS) problem is the LCS problem with multiple substring-exclusion constraints (M-STR-EC-LCS), which has great importance in many fields especially in bioinformatics. This problem consists to compute the LCS of two strings X and Y of length n and m respectively that excluded a set of d constraints P = {P-1, P-2, ..., P-d) of total length r. Recently, Wang et al. proposed a sequential solution based on the dynamic programming technique that requires O(nmr) execution time and space. To the best of our knowledge, there is no parallel solutions for this problem. This paper describes new efficient parallel algorithms on Coarse Grained Multicomputer model (CGM) to solve this problem. Firstly, we propose a multi-level Direct Acyclic Graph (DAG) that determines the correct evaluation order of sub-problems in order to avoid redundancy due to overlap. Secondly, we propose two CGM parallel algorithms based on our DAG. The first algorithm is based on a regular partitioning of the DAG and requires O(nmr/p) execution time with O(p) communication rounds where p is the number of processors used. Its main drawback is high idleness time of processors because due to the dependencies between the nodes in the DAG, over time it has many idle processors. The second algorithm uses an irregular partitioning of the DAG that minimizes this idleness time by allowing the processors to stay active as long as possible. It requires O(nmr/p) execution time with O(kp) communication rounds. k is a constant integer allowing to setup the irregular partitioning. The both algorithms require O(r vertical bar Sigma vertical bar/p) preprocessing time where vertical bar Sigma vertical bar is the length of the alphabet. The experimental results performed show a good agreement with theoretical predictions. (C) 2019 Elsevier B.V. All rights reserved.
引用
收藏
页数:15
相关论文
共 26 条
  • [1] EFFICIENT STRING MATCHING - AID TO BIBLIOGRAPHIC SEARCH
    AHO, AV
    CORASICK, MJ
    [J]. COMMUNICATIONS OF THE ACM, 1975, 18 (06) : 333 - 340
  • [2] A coarse-grained parallel algorithm for the all-substrings longest common subsequence problem
    Alves, Carlos E. R.
    Caceres, Edson N.
    Song, Siang Wun
    [J]. ALGORITHMICA, 2006, 45 (03) : 301 - 335
  • [3] Efficient algorithms for the block-edit problems
    Ann, Hsing-Yen
    Yang, Chang-Biau
    Peng, Yung-Hsing
    Liaw, Bern-Cherng
    [J]. INFORMATION AND COMPUTATION, 2010, 208 (03) : 221 - 229
  • [4] [Anonymous], 2007, Algorithms on Strings, DOI DOI 10.1017/CBO9780511546853
  • [5] [Anonymous], [No title captured]
  • [6] [Anonymous], 1997, ACM SIGACT NEWS
  • [7] [Anonymous], 1985, NATO ISI Series
  • [8] THE LONGEST COMMON SUBSEQUENCE PROBLEM REVISITED
    APOSTOLICO, A
    GUERRA, C
    [J]. ALGORITHMICA, 1987, 2 (03) : 315 - 336
  • [9] Cheatham T., 1995, Proceedings of the Twenty-Eighth Hawaii International Conference on System Sciences, P268, DOI 10.1109/HICSS.1995.375451
  • [10] On the generalized constrained longest common subsequence problems
    Chen, Yi-Ching
    Chao, Kun-Mao
    [J]. JOURNAL OF COMBINATORIAL OPTIMIZATION, 2011, 21 (03) : 383 - 392