MAPSKEW: Metaheuristic Approaches for Partitioning Skew in MapReduce

被引:2
作者
Pericini, Matheus H. M. [1 ]
Leite, Lucas G. M. [1 ]
de Carvalho-Junior, Francisco H. [1 ]
Machado, Javam C. [1 ]
Rezende, Cenez A. [2 ]
机构
[1] Univ Fed Ceara, Dept Pos Grad Ciencias Comp MDCC, BR-60020181 Fortaleza, CE, Brazil
[2] Univ Fortaleza UNIFOR, CCT, BR-60020181 Fortaleza, CE, Brazil
关键词
MapReduce; partitioning skew; metaheuristics; parallel computing; high performance computing;
D O I
10.3390/a12010005
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
MapReduce is a parallel computing model in which a large dataset is split into smaller parts and executed on multiple machines. Due to its simplicity, MapReduce has been widely used in various applications domains. MapReduce can significantly reduce the processing time of a large amount of data by dividing the dataset into smaller parts and processing them in parallel in multiple machines. However, when data are not uniformly distributed, we have the so called partitioning skew, where the allocation of tasks to machines becomes unbalanced, either by the distribution function splitting the dataset unevenly or because a part of the data is more complex and requires greater computational effort. To solve this problem, we propose an approach based on metaheuristics. For evaluating purposes, three metaheuristics were implemented: Simulated Annealing, Local Beam Search and Stochastic Beam Search. Our experimental evaluation, using a MapReduce implementation of the Bron-Kerbosch Clique Algorithm, shows that the proposed method can find good partitionings while better balancing data among machines.
引用
收藏
页数:14
相关论文
共 20 条
[1]  
[Anonymous], 2014, P 11 INT C AUT COMP
[2]  
[Anonymous], 2011, P 1 INT C CLOUD COMP
[3]  
Atta F., 2011, 2011 Proceedings of the IEEE 14th International Multitopic Conference (INMIC 2011), P170, DOI 10.1109/INMIC.2011.6151466
[4]   FINDING ALL CLIQUES OF AN UNDIRECTED GRAPH [H] [J].
BRON, C ;
KERBOSCH, J .
COMMUNICATIONS OF THE ACM, 1973, 16 (09) :575-577
[5]  
Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDE '04), P137
[6]   Mars: A MapReduce Framework on Graphics Processors [J].
He, Bingsheng ;
Fang, Wenbin ;
Luo, Qiong ;
Govindaraju, Naga K. ;
Wang, Tuyong .
PACT'08: PROCEEDINGS OF THE SEVENTEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, 2008, :260-269
[7]  
Ibrahim S., 2010, Proceedings of the 2010 IEEE 2nd International Conference on Cloud Computing Technology and Science (CloudCom 2010), P17, DOI 10.1109/CloudCom.2010.25
[8]   Handling partitioning skew in MapReduce using LEEN [J].
Ibrahim, Shadi ;
Jin, Hai ;
Lu, Lu ;
He, Bingsheng ;
Antoniu, Gabriel ;
Wu, Song .
PEER-TO-PEER NETWORKING AND APPLICATIONS, 2013, 6 (04) :409-424
[9]  
Isard M., 2007, Operating Systems Review, V41, P59, DOI 10.1145/1272998.1273005
[10]  
Kumar V.S., 2017, CLUSTER COMPUT, P1