Improving Performance on Data-Intensive Applications Using a Load Balancing Methodology Based on Divisible Load Theory

Cited by: 6
Authors
Rosas, Claudia [1]
Sikora, Anna [1]
Jorba, Josep [2]
Moreno, Andreu [3]
Cesar, Eduardo [1]
Affiliations
[1] Univ Autonoma Barcelona, Bellaterra 08193, Spain
[2] Univ Oberta Catalunya, Barcelona 08018, Spain
[3] Escola Univ Salesiana Sarria, Barcelona 08017, Spain
Keywords
Load balancing; Data-intensive; Divisible Load Theory; Performance improvement
DOI
10.1007/s10766-012-0199-4
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Data-intensive applications are those that explore, query, analyze, and, in general, process very large data sets. These applications can usually be implemented in parallel in a natural way but, in many cases, the implementations show severe performance problems, mainly due to load imbalances, inefficient use of available resources, and improper data partition policies. The problem becomes more complex when the conditions causing these problems change at run time. This paper proposes a methodology for dynamically improving the performance of certain data-intensive applications by adapting the size and number of data partitions, and the number of processing nodes, to the current application conditions in homogeneous clusters. To this end, the processing of each exploration is monitored, and the gathered data is used to dynamically tune the performance of the application. The tuning parameters included in the methodology are: (i) the partition factor of the data set, (ii) the distribution of the data chunks, and (iii) the number of processing nodes to be used. The methodology assumes that a single execution includes multiple related explorations on the same partitioned data set, and that data chunks are ordered according to their processing times during the application execution, so that the most time-consuming partitions are assigned first. The methodology has been validated using the well-known bioinformatics tool BLAST and through extensive experimentation using simulation. Reported results are encouraging in terms of reducing the total execution time of the application (by up to 40% in some cases).
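The abstract's idea of ordering data chunks by processing time and assigning the most time-consuming ones first can be illustrated with a generic longest-processing-time-first greedy scheduler. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `assign_longest_first`, the chunk identifiers, and the per-chunk times (imagined as measurements from a previous exploration) are all hypothetical, and the nodes are assumed to be identical, as in a homogeneous cluster.

```python
import heapq


def assign_longest_first(chunk_times, num_workers):
    """Greedily assign chunks to identical workers, longest chunk first.

    chunk_times: dict mapping a (hypothetical) chunk id to its measured
                 processing time from an earlier exploration.
    num_workers: number of identical processing nodes.
    Returns (assignment, makespan), where assignment maps each worker
    index to the list of chunk ids it processes, in order.
    """
    # Min-heap of (accumulated load, worker index): the least-loaded
    # worker is always at the top.
    heap = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(num_workers)}

    # Visit chunks in descending order of measured processing time.
    for chunk_id in sorted(chunk_times, key=chunk_times.get, reverse=True):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(chunk_id)
        heapq.heappush(heap, (load + chunk_times[chunk_id], worker))

    # The makespan is the load of the most heavily loaded worker.
    makespan = max(load for load, _ in heap)
    return assignment, makespan
```

For example, calling `assign_longest_first({"c0": 8.0, "c1": 5.0, "c2": 4.0, "c3": 3.0, "c4": 2.0}, 2)` places the two longest chunks on different workers first and then fills in the shorter ones. The methodology described in the abstract additionally tunes the partition factor and the number of nodes at run time, which this static sketch does not attempt to model.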
Pages: 94-118
Page count: 25