Performance Comparison of Distributed Pattern Matching Algorithms on Hadoop MapReduce Framework

被引:0
|
作者
Sona, C. P. [1 ]
Mulerikkal, Jaison Paul [1 ]
机构
[1] Rajagiri Sch Engn & Technol, Sunya Labs, Kochi, Kerala, India
来源
MOBILE NETWORKS AND MANAGEMENT (MONAMI 2017) | 2018年 / 235卷
关键词
Pattern matching; Hadoop; MapReduce; Big Data; Knuth Morris Pratt Algorithm; Boyer Moore Algorithm; Franek Jennings Smyth Algorithm;
D O I
10.1007/978-3-319-90775-8_4
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Creating meaning out of the growing Big Data is an insurmountable challenge data scientists face and pattern matching algorithms are great means to create such meaning from heaps of data. However, the available pattern matching algorithms are mostly tested with linear programming models whose adaptability and efficiency are not tested in distributed programming models such as Hadoop Map-Reduce, which supports Big Data. This paper explains an experience of parallelizing three of such pattern matching algorithms, namely - Knuth Morris Pratt Algorithm (KMP), Boyer Moore Algorithm (BM) and a lesser known Franek Jennings Smyth (FJS) Algorithm and porting them to Hadoop MapReduce framework. All the three algorithms are converted to MapReduce programs using key value pairs and experimented on single node as well as cluster Hadoop environment. The result analysis with the Project Gutenberg data-set has shown all the three parallel algorithms scale well on Hadoop environment as the data size increases. The experimental results prove that KMP algorithm gives higher performance for shorter patterns over BM, and BM algorithm gives higher performance than KMP for longer patterns. However, FJS algorithm, which is a hybrid of KMP and Boyer horspool algorithm which is advanced version of BM, outperforms both KMP and BM for shorter and longer patterns, and emerges as the most suitable algorithm for pattern matching in a Hadoop environment.
引用
收藏
页码:45 / 55
页数:11
相关论文
共 50 条
  • [1] Performance Enhancement of Hadoop MapReduce Framework for Analyzing BigData
    Prabhu, Swathi
    Rodrigues, Anisha P.
    Prasad, Guru M. S.
    Nagesh, H. R.
    2015 IEEE INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER AND COMMUNICATION TECHNOLOGIES, 2015,
  • [2] A Parallel Genetic Algorithms Framework based on Hadoop MapReduce
    Ferrucci, Filomena
    Salza, Pasquale
    Kechadi, M-Tahar
    Sarro, Federica
    30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II, 2015, : 1664 - 1667
  • [3] MapReduce scheduling algorithms in Hadoop: a systematic study
    Hedayati, Soudabeh
    Maleki, Neda
    Olsson, Tobias
    Ahlgren, Fredrik
    Seyednezhad, Mahdi
    Berahmand, Kamal
    JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS, 2023, 12 (01):
  • [4] A REVIEW OF RESEARCH ON MAPREDUCE SCHEDULING ALGORITHMS IN HADOOP
    Singh, Namrata
    Agrawal, Sanjay
    2015 INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION & AUTOMATION (ICCCA), 2015, : 637 - 642
  • [5] Comparison and Improvement of Hadoop MapReduce Performance Prediction Models in the Private Cloud
    Wang, Nini
    Yang, Jian
    Lu, Zhihui
    Li, Xiaoyan
    Wu, Jie
    ADVANCES IN SERVICES COMPUTING, 2016, 10065 : 77 - 91
  • [6] Multi-pattern Matching Algorithm Based on MapReduce and Hadoop
    Zhang, Wei
    Li, Baolu
    Li, Kun
    PROCEEDINGS 2013 INTERNATIONAL CONFERENCE ON MECHATRONIC SCIENCES, ELECTRIC ENGINEERING AND COMPUTER (MEC), 2013, : 1856 - 1859
  • [7] Straggler Mitigation in Hadoop MapReduce Framework: A Review
    Ajibade, Lukuman Saheed
    Abu Bakar, Kamalrulnizam
    Aliyu, Ahmed
    Danish, Tasneem
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (08) : 870 - 878
  • [8] An Approach to Enhance the Performance of Hadoop MapReduce Framework for Big Data
    Chandra, Subhash
    Motwani, Deepak
    2016 INTERNATIONAL CONFERENCE ON MICRO-ELECTRONICS AND TELECOMMUNICATION ENGINEERING (ICMETE), 2016, : 178 - 182
  • [9] Hadoop-MapReduce Job Scheduling Algorithms Survey
    Mohamed, Ehab
    Hong, Zheng
    2016 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CCBD), 2016, : 237 - 242
  • [10] BitmapAligner: Bit-Parallelism String Matching with MapReduce and Hadoop
    Aksa, Mary
    Rashid, Junaid
    Nisar, Muhammad Wasif
    Mahmood, Toqeer
    Kwon, Hyuk-Yoon
    Hussain, Amir
    CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 68 (03): : 3931 - 3946