Improving Performance on Data-Intensive Applications Using a Load Balancing Methodology Based on Divisible Load Theory

Cited by: 6
Authors
Rosas, Claudia [1]
Sikora, Anna [1]
Jorba, Josep [2]
Moreno, Andreu [3]
Cesar, Eduardo [1]
Affiliations
[1] Univ Autonoma Barcelona, Bellaterra 08193, Spain
[2] Univ Oberta Catalunya, Barcelona 08018, Spain
[3] Escola Univ Salesiana Sarria, Barcelona 08017, Spain
Keywords
Load balancing; Data-intensive; Divisible Load Theory; Performance improvement
DOI
10.1007/s10766-012-0199-4
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Data-intensive applications are those that explore, query, analyze, and, in general, process very large data sets. These applications can usually be implemented in parallel in a natural way, but in many cases the implementations show severe performance problems, mainly due to load imbalance, inefficient use of available resources, and improper data partitioning policies. The problem becomes more complex when the conditions causing these problems change at run time. This paper proposes a methodology for dynamically improving the performance of certain data-intensive applications by adapting the size and number of data partitions, and the number of processing nodes, to the current application conditions in homogeneous clusters. To this end, the processing of each exploration is monitored, and the gathered data is used to dynamically tune the performance of the application. The tuning parameters included in the methodology are: (i) the partition factor of the data set, (ii) the distribution of the data chunks, and (iii) the number of processing nodes to be used. The methodology assumes that a single execution includes multiple related explorations on the same partitioned data set, and that data chunks are ordered according to their processing times during the application execution so that the most time-consuming partitions are assigned first. The methodology has been validated using the well-known bioinformatics tool BLAST and through extensive experimentation using simulation. Reported results are encouraging in terms of reducing the total execution time of the application (up to 40% in some cases).
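The chunk-ordering idea mentioned in the abstract (assigning the most time-consuming partitions first) can be illustrated with a minimal Python sketch. This is a generic longest-processing-time-first heuristic under stated assumptions, not the authors' implementation: the function name `assign_chunks`, the dictionary input format, and the sample times are hypothetical, and workers are assumed to be identical, as in a homogeneous cluster.

```python
import heapq


def assign_chunks(chunk_times, num_workers):
    """Greedy longest-processing-time-first assignment (illustrative only).

    chunk_times: dict mapping a chunk id to a processing time measured
                 in an earlier exploration (hypothetical input format).
    Returns (assignment, makespan), where assignment maps each worker
    index to the list of chunk ids it receives, in assignment order.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")

    # Min-heap of (accumulated load, worker index); the least-loaded
    # worker is always at the top.
    heap = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(num_workers)}

    # Hand out the most time-consuming chunks first.
    ordered = sorted(chunk_times.items(), key=lambda kv: kv[1], reverse=True)
    for chunk, t in ordered:
        load, w = heapq.heappop(heap)
        assignment[w].append(chunk)
        heapq.heappush(heap, (load + t, w))

    # The makespan is the load of the busiest worker.
    makespan = max(load for load, _ in heap)
    return assignment, makespan


if __name__ == "__main__":
    # Hypothetical measured times (seconds) for five data chunks.
    times = {"a": 8.0, "b": 5.0, "c": 4.0, "d": 3.0, "e": 2.0}
    plan, makespan = assign_chunks(times, num_workers=2)
    print(plan, makespan)
```

In a setting like the one the abstract describes, the measured times would presumably come from monitoring a previous exploration over the same partitioned data set; how the paper actually collects and applies them is not specified in this record.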
Pages: 94-118
Page count: 25
Source
International Journal of Parallel Programming, 2014, 42: 94-118