Energy-Efficient Analytics for Geographically Distributed Big Data

Cited by: 1
Authors
Zhao, Peng [1]
Yang, Xinyu [1]
Lin, Jie [1]
Yang, Shusen [2, 3]
Yu, Wei [4]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Comp Sci & Technol, Xian, Shaanxi, Peoples R China
[2] Xi An Jiao Tong Univ, Inst Informat & Syst Sci, Xian, Shaanxi, Peoples R China
[3] Imperial Coll London, London, England
[4] Towson Univ, Dept Comp & Informat Sci, Towson, MD USA
Keywords
Big data analytics; cost minimization; data centers; energy consumption; geographically data distribution
DOI
10.1109/MIC.2019.2920584
Chinese Library Classification number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
Big data analytics on geographically distributed datasets (across data centers or clusters) has attracted increasing interest in both academia and industry, and it poses significant challenges for system and algorithm design. In this paper, we systematically investigate the geodistributed big data analytics framework by analyzing its fine-grained paradigm and key design principles. We present a dynamic global manager selection algorithm that minimizes energy consumption cost by fully exploiting the system's diversity across geography and its variation over time. The algorithm makes real-time decisions based on measurable system parameters through stochastic optimization methods, while balancing energy cost against latency. Extensive trace-driven simulations verify the effectiveness and efficiency of the proposed algorithm. We also highlight several open research directions in geodistributed big data analytics that require further study.
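The abstract does not spell out how the selection algorithm works. As a rough illustration only, the sketch below assumes a drift-plus-penalty rule, a common stochastic optimization technique for trading cost against queueing delay. It is not the authors' algorithm. The `Site` fields, the weight `v`, and both function names are hypothetical and chosen purely for this example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """One candidate data center (all fields are hypothetical)."""
    name: str
    energy_price: float   # assumed: price per unit of energy in the current slot
    energy_needed: float  # assumed: energy used if this site acts as global manager
    service_rate: float   # assumed: jobs this site can complete in the current slot


def select_manager(sites, backlog, v):
    """Pick the site minimizing v * energy_cost - backlog * service_rate.

    A larger v favors low energy cost; a larger backlog favors fast sites.
    This is an assumed drift-plus-penalty style rule, not the paper's method.
    """
    def score(site):
        energy_cost = site.energy_price * site.energy_needed
        return v * energy_cost - backlog * site.service_rate

    return min(sites, key=score)


def update_backlog(backlog, arrivals, served):
    """Advance the job queue by one slot: serve first, then add new arrivals."""
    return max(backlog - served, 0) + arrivals


if __name__ == "__main__":
    sites = [
        Site("cheap-slow", energy_price=1.0, energy_needed=10.0, service_rate=1.0),
        Site("costly-fast", energy_price=3.0, energy_needed=10.0, service_rate=3.0),
    ]
    backlog = 0.0
    for arrivals in [2, 2, 2, 2, 2]:  # made-up arrival counts per slot
        manager = select_manager(sites, backlog, v=0.1)
        backlog = update_backlog(backlog, arrivals, manager.service_rate)
        print(manager.name, backlog)
```

With an empty queue the rule picks the cheaper site. As the backlog grows, the faster site becomes preferable, which is one way to express the balance between energy cost and latency that the abstract mentions.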
Pages: 18-29
Page count: 12