Hadoop-Based Big Data Distributions: A Comparative Study

Cited by: 2
Authors
Hamdaoui, Ikram [1]
El Fissaoui, Mohamed [2]
El Makkaoui, Khalid [1]
El Allali, Zakaria [1]
Affiliations
[1] Mohammed First Univ, FPD, MSC team, LaMAO Lab, Nador, Morocco
[2] Mohammed First Univ, FPD, LMASI Lab, Nador, Morocco
Source
EMERGING TRENDS IN INTELLIGENT SYSTEMS & NETWORK SECURITY | 2023, Vol. 147
Keywords
Batch processing; Big data; Cloud computing; Hadoop distributions; Stream processing
DOI
10.1007/978-3-031-15191-0_24
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Approximately 2.5 quintillion bytes of data in various forms (structured, semi-structured, or unstructured) are generated every day. Big data technology emerged to overcome the limitations of traditional methods, which can no longer handle and process large amounts of data in various forms. Hadoop is an open-source big data solution created to store, process, and manage huge volumes of different types of data. Over the last decade, many companies have developed their own Hadoop distributions based on the Hadoop ecosystem. This paper presents the most popular Hadoop distributions, including MapR, Hortonworks, Cloudera, IBM InfoSphere BigInsights, Amazon Elastic MapReduce, Azure HDInsight, Pivotal HD, and Qubole. It then provides a detailed comparison of these distributions.
Pages: 242-252
Number of pages: 11