A new fast technique for pattern matching in biological sequences

Cited by: 11
Authors
Ibrahim, Osman Ali Sadek [1]
Hamed, Belal A. [1]
Abd El-Hafeez, Tarek [1, 2]
Affiliations
[1] Minia Univ, Dept Comp Sci, Fac Sci, El Minia, Egypt
[2] Deraya Univ, Comp Sci Unit, El Minia, Egypt
Source
JOURNAL OF SUPERCOMPUTING | 2023, Vol. 79, No. 1
Keywords
Bioinformatics; Character comparison; Pattern matching; String Matching; DNA Sequences; ALGORITHM;
DOI
10.1007/s11227-022-04673-3
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812
Abstract
Pattern matching is essential at many stages of computational analysis. It lets users search a database for specific DNA sequences or subsequences. Biological databases are expanding rapidly, and some are updated regularly, so high-speed pattern matching algorithms can improve pattern searches. As biological data grows exponentially, researchers are working to improve solutions across computational bioinformatics, and real-world applications need faster algorithms with a low error rate. This study therefore proposes two pattern matching algorithms, EFLPM and EPAPM, designed to speed up pattern searches in DNA sequences. Both improve performance by processing at the word level rather than at the character level used in previous studies. In terms of time cost, the proposed algorithms improve performance by leveraging word-level processing with large pattern sizes. The experimental results show that the proposed methods are faster than other algorithms for both short and long patterns. Specifically, EFLPM is 54% faster than FLPM, and EPAPM is 39% faster than PAPM.
Pages: 367-388
Number of pages: 22
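
The abstract contrasts word-level with character-level processing but does not describe how EFLPM or EPAPM work. The sketch below is therefore not the authors' method. It is only a generic illustration of the idea, assuming a DNA alphabet of A, C, G, T packed at 2 bits per base, so that a whole window is checked with one integer comparison instead of one comparison per character. The names `CODE`, `pack`, `find_naive`, and `find_packed` are hypothetical. Python integers are arbitrary precision, so a single comparison here only stands in for a machine-word comparison. In a fixed-width language, a 64-bit word would hold up to 32 bases.

```python
# Illustrative only: NOT the EFLPM/EPAPM algorithms from the paper.
# Assumption: the input contains only the four bases A, C, G, T.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def find_naive(text, pattern):
    """Character-level baseline: compare one base at a time."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    hits = []
    for i in range(n - m + 1):
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:
            hits.append(i)
    return hits


def find_packed(text, pattern):
    """Word-level variant: one integer comparison per window."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    mask = (1 << (2 * m)) - 1      # keeps only the last m bases
    target = pack(pattern)
    window = pack(text[:m])        # packed form of the first window
    hits = [0] if window == target else []
    for i in range(m, n):
        # Slide the window: shift in the next base, drop the oldest one.
        window = ((window << 2) | CODE[text[i]]) & mask
        if window == target:
            hits.append(i - m + 1)
    return hits


if __name__ == "__main__":
    text = "ACGTACGTTACG"
    print(find_naive(text, "ACG"))
    print(find_packed(text, "ACG"))
```

Because the 2-bit encoding is one-to-one for a fixed window length, the packed comparison is exact and both functions return the same match positions. Only the number of comparisons per window differs.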