BIG DATA PROCESSING: BIG CHALLENGES AND OPPORTUNITIES

Cited by: 27
Authors
Ji, Changqing [1, 2]
Li, Yu [3]
Qiu, Wenming [3]
Jin, Yingwei [4]
Xu, Yujie [1]
Awada, Uchechukwu [3]
Li, Keqiu [3]
Qu, Wenyu [1]
Affiliations
[1] Dalian Maritime Univ, Coll Informat Sci & Technol, Dalian 116026, Peoples R China
[2] Dalian Univ, Coll Phys Sci & Technol, Dalian 116622, Peoples R China
[3] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116024, Peoples R China
[4] Dalian Univ Technol, Sch Management, Dalian 116024, Peoples R China
Funding
National Science Foundation (United States);
Keywords
Big data; cloud computing; data management; distributed processing;
DOI
10.1142/S0219265912500090
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
With the rapid growth of emerging applications such as social networks, the semantic web, sensor networks and LBS (Location Based Service) applications, the variety of data to be processed continues to increase quickly. Effective management and processing of large-scale data poses an interesting but critical challenge. Recently, big data has attracted a lot of attention from academia, industry and government. This paper introduces several big data processing techniques from the system and application perspectives. First, from the viewpoint of cloud data management and big data processing mechanisms, we present the key issues of big data processing, including the definition of big data, big data management platforms, big data service models, distributed file systems, data storage, data virtualization platforms and distributed applications. Next, following the MapReduce parallel processing framework, we introduce some MapReduce optimization strategies reported in the literature. Finally, we discuss the open issues and challenges, and explore future research directions for big data processing in cloud computing environments.
Pages: 19