Performance Comparison of Distributed Pattern Matching Algorithms on Hadoop MapReduce Framework

被引:0
|
作者
Sona, C. P. [1 ]
Mulerikkal, Jaison Paul [1 ]
机构
[1] Rajagiri Sch Engn & Technol, Sunya Labs, Kochi, Kerala, India
来源
MOBILE NETWORKS AND MANAGEMENT (MONAMI 2017) | 2018年 / 235卷
关键词
Pattern matching; Hadoop; MapReduce; Big Data; Knuth Morris Pratt Algorithm; Boyer Moore Algorithm; Franek Jennings Smyth Algorithm;
D O I
10.1007/978-3-319-90775-8_4
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Creating meaning out of the growing Big Data is an insurmountable challenge data scientists face and pattern matching algorithms are great means to create such meaning from heaps of data. However, the available pattern matching algorithms are mostly tested with linear programming models whose adaptability and efficiency are not tested in distributed programming models such as Hadoop Map-Reduce, which supports Big Data. This paper explains an experience of parallelizing three of such pattern matching algorithms, namely - Knuth Morris Pratt Algorithm (KMP), Boyer Moore Algorithm (BM) and a lesser known Franek Jennings Smyth (FJS) Algorithm and porting them to Hadoop MapReduce framework. All the three algorithms are converted to MapReduce programs using key value pairs and experimented on single node as well as cluster Hadoop environment. The result analysis with the Project Gutenberg data-set has shown all the three parallel algorithms scale well on Hadoop environment as the data size increases. The experimental results prove that KMP algorithm gives higher performance for shorter patterns over BM, and BM algorithm gives higher performance than KMP for longer patterns. However, FJS algorithm, which is a hybrid of KMP and Boyer horspool algorithm which is advanced version of BM, outperforms both KMP and BM for shorter and longer patterns, and emerges as the most suitable algorithm for pattern matching in a Hadoop environment.
引用
收藏
页码:45 / 55
页数:11
相关论文
共 50 条
  • [41] A Framework for Distributed Pattern Matching Based on Multithreading
    Kofahi, Najib
    Abusalama, Ahmed
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2012, 9 (01) : 30 - 38
  • [42] GPU-In-Hadoop: Enabling MapReduce Across Distributed Heterogeneous Platforms
    Zhu, Jie
    Li, Juanjuan
    Hardesty, Erikson
    Jiang, Hai
    Li, Kuan-Ching
    2014 IEEE/ACIS 13TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2014, : 315 - 320
  • [43] A Survey on Parallel Join Algorithms Using MapReduce on Hadoop
    Barhoush, Malek Mahmoud
    AlSobeh, Anas Mohammad
    Al Rawashdeh, Ahmad
    2019 IEEE JORDAN INTERNATIONAL JOINT CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATION TECHNOLOGY (JEEIT), 2019, : 381 - 388
  • [44] Pattern matching of signature-based ids using myers algorithm under mapreduce framework
    Aldwairi M.
    Abu-Dalo A.M.
    Jarrah M.
    EURASIP Journal on Information Security, 2017 (1)
  • [45] Analyzing performance of Apache Tez and MapReduce with hadoop multinode cluster on Amazon cloud
    Singh R.
    Kaur P.J.
    Journal of Big Data, 3 (1)
  • [46] Hadoop framework implementation and performance analysis on a cloud
    Ozen, Goksu Zekiye
    Tekerek, Mehmet
    Sultanov, Rayimbek
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2017, 25 (02) : 705 - 716
  • [47] Research on Power System Harmonic Detection based on Hadoop MapReduce Framework
    Chen Wenjuan
    Chen Shihua
    Wang Zheqiang
    Li Mengjie
    Zhou Yuan
    2018 CHINA INTERNATIONAL CONFERENCE ON ELECTRICITY DISTRIBUTION (CICED), 2018, : 431 - 435
  • [48] Apache Hadoop Based Distributed Denial of Service Detection Framework
    Patil, Nilesh Vishwasrao
    Krishna, C. Rama
    Kumar, Krishan
    INFORMATION, COMMUNICATION AND COMPUTING TECHNOLOGY (ICICCT 2019), 2019, 1025 : 25 - 35
  • [49] Assessing MapReduce for Internet Computing: A Comparison of Hadoop and BitDew-MapReduce
    Lu, Lu
    Jin, Hai
    Shi, Xuanhua
    Fedak, Gilles
    2012 ACM/IEEE 13TH INTERNATIONAL CONFERENCE ON GRID COMPUTING (GRID), 2012, : 76 - 84
  • [50] Performance Analysis of Graph Based Iterative Algorithms on MapReduce Framework
    Debbarma, Akashdeep
    Annappa, B.
    Mude, Ravi G.
    2014 INTERNATIONAL CONFERENCE FOR CONVERGENCE OF TECHNOLOGY (I2CT), 2014,