Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks

Cited by: 155
Authors
Fernandez, Alberto [1]
del Rio, Sara [2]
Lopez, Victoria [2]
Bawakid, Abdullah [3]
del Jesus, Maria J. [1]
Benitez, Jose M. [2]
Herrera, Francisco [2, 3]
Affiliations
[1] Univ Jaen, Dept Comp Sci, Jaen, Spain
[2] Univ Granada, Dept Comp Sci & Artificial Intelligence, Granada, Spain
[3] King Abdulaziz Univ, Fac Comp & Informat Technol North Jeddah, Jeddah 21413, Saudi Arabia
Keywords
DATA SCIENCE; MAP-REDUCE; PERFORMANCE; CLASSIFICATION; ALGORITHMS; CHALLENGES; TECHNOLOGIES; ASSOCIATION; SIMILARITY; DATABASES;
DOI
10.1002/widm.1134
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The term 'Big Data' has spread rapidly in the framework of Data Mining and Business Intelligence. This new scenario can be defined by means of those problems that cannot be effectively or efficiently addressed using the standard computing resources that we currently have. We must emphasize that Big Data does not just imply large volumes of data but also the necessity for scalability, i.e., to ensure a response in an acceptable elapsed time. When scalability is considered, traditional parallel-type solutions are usually contemplated, such as the Message Passing Interface or high-performance and distributed Database Management Systems. Nowadays, there is a new paradigm that has gained popularity over the latter due to the number of benefits it offers. This model is Cloud Computing, and among its main features we must stress its elasticity in the use of computing resources and space, less management effort, and flexible costs. In this article, we provide an overview of the topic of Big Data, and how the current problem can be addressed from the perspective of Cloud Computing and its programming frameworks. In particular, we focus on those systems for large-scale analytics based on the MapReduce scheme and Hadoop, its open-source implementation. We identify several libraries and software projects that have been developed to help practitioners adopt this new programming model. We also analyze the advantages and disadvantages of MapReduce, in contrast to the classical solutions in this field. Finally, we present a number of programming frameworks that have been proposed as an alternative to MapReduce, developed under the premise of solving the shortcomings of this model in certain scenarios and platforms. WIREs Data Mining Knowl Discov 2014, 4:380-409. doi: 10.1002/widm.1134 Conflict of interest: The authors have declared no conflicts of interest for this article.
Pages: 380 - 409
Page count: 30
Related papers
50 in total
  • [1] "Big" Data Management in Cloud Computing Environment
    Agarwal, Mohit
    Srivastava, Gur Mauj Saran
    HARMONY SEARCH AND NATURE INSPIRED OPTIMIZATION ALGORITHMS, 2019, 741 : 707 - 716
  • [2] Security by Design for Big Data Frameworks Over Cloud Computing
    Awaysheh, Feras M.
    Aladwan, Mohammad N.
    Alazab, Mamoun
    Alawadi, Sadi
    Cabaleiro, Jose C.
    Pena, Tomas F.
    IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, 2022, 69 (06) : 3676 - 3693
  • [3] Big Data Processing for Pervasive Environment in Cloud Computing
    Amato, Alba
    Di Martino, Beniamino
    Venticinque, Salvatore
    2014 INTERNATIONAL CONFERENCE ON INTELLIGENT NETWORKING AND COLLABORATIVE SYSTEMS (INCOS), 2014, : 598 - 603
  • [4] Application Of Cloud Computing In Biomedicine Big Data Analysis Cloud Computing In Big Data
    Yang, Tianyi
    Zhao, Yang
    2017 INTERNATIONAL CONFERENCE ON ALGORITHMS, METHODOLOGY, MODELS AND APPLICATIONS IN EMERGING TECHNOLOGIES (ICAMMAET), 2017,
  • [5] Cloud Computing and Big Data
    Hsu, Ching-Hsien
    Tang, Chunming
    Esteves, Rui M.
    JOURNAL OF INTERNET TECHNOLOGY, 2014, 15 (06) : 995 - 997
  • [6] Big data and cloud computing
    Shrestha, Rasu B.
    APPLIED RADIOLOGY, 2014, 43 (03) : 32 - 34
  • [7] Computation Partitioning for Mobile Cloud Computing in a Big Data Environment
    Li, Jianqiang
    Huang, Luxiang
    Zhou, Yaoming
    He, Suiqiang
    Ming, Zhong
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2017, 13 (04) : 2009 - 2018
  • [8] A Critical Review of Cloud Computing Environment for Big Data Analytics
    Dzulhikam, Dzulaisar
    Rana, Muhammad Ehsan
    2022 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATIONS (DASA), 2022, : 76 - 81
  • [9] Computation Partitioning for Mobile Cloud Computing in a Big Data Environment
    Huang, Luxiang
    Li, Jianqiang
    Li, Jun
    Yi, Dongyi
    2017 2ND INTERNATIONAL CONFERENCE ON FRONTIERS OF SENSORS TECHNOLOGIES (ICFST), 2017, : 312 - 316
  • [10] Soft computing techniques for big data and cloud computing
    B. B. Gupta
    Dharma P. Agrawal
    Shingo Yamaguchi
    Michael Sheng
    Soft Computing, 2020, 24 : 5483 - 5484