The state of the art and taxonomy of big data analytics: view from new big data framework

Cited by: 148
Authors
Mohamed, Azlinah [1]
Najafabadi, Maryam Khanian [2]
Wah, Yap Bee [1]
Zaman, Ezzatul Akmal Kamaru [1]
Maskat, Ruhaila [1]
Affiliations
[1] Univ Teknol MARA, Adv Analyt Engn Ctr, Fac Comp & Math Sci, Shah Alam, Selangor, Malaysia
[2] INTI Int Univ & Coll, Fac Informat Technol, Nilai, Negeri Sembilan, Malaysia
Keywords
Parallel and distributed computing; Big data tools; Big data analytics techniques; Domain area; MAP REDUCE SOLUTION; REAL-TIME; ATTRIBUTE REDUCTION; HIGH-PERFORMANCE; DECISION-MAKING; STREAMING DATA; EFFICIENT; MAPREDUCE; CLASSIFICATION; ARCHITECTURE
DOI
10.1007/s10462-019-09685-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Big data has become a significant research area because of the enormous volume of data generated by sources such as social media, the Internet of Things, and multimedia applications. It plays a critical role in decision making and forecasting in domains such as recommendation systems, business analysis, healthcare, web display advertising, clinical practice, transportation, fraud detection, and tourism marketing. The rapid development of big data tools such as Hadoop, Storm, Spark, Flink, Kafka, and Pig in the research and industrial communities has allowed huge amounts of data to be distributed, communicated, and processed. Big data applications use big data analytics techniques to analyze large amounts of data efficiently. However, choosing suitable big data tools, based on batch and stream data processing, and suitable analytics techniques for developing a big data system is difficult because of the challenges in processing and applying big data. Practitioners and researchers who develop big data systems have inadequate information about current technology and the requirements of big data platforms. The strengths and weaknesses of big data technologies, and effective solutions to big data challenges, therefore need to be discussed. To that end, this paper reviews the literature that analyzes the use of big data tools and big data analytics techniques in areas such as health and medical care, social networking and the internet, government and the public sector, natural resource management, and the economic and business sector. The goals of this paper are to (1) understand the trend of big data-related research and the current frameworks of big data technologies; (2) identify trends in the use of, or research on, big data tools based on batch and stream processing and big data analytics techniques; and (3) help new researchers and practitioners position new research activity in this domain appropriately.
The findings of this study provide insights into existing big data platforms and their application domains, the advantages and disadvantages of big data tools, big data analytics techniques and their use, and new research opportunities in the future development of big data systems.
Pages: 989-1037
Page count: 49