Clustering-based data placement in cloud computing: a predictive approach

Authors
Mokhtar Sellami
Haithem Mezni
Mohand Said Hacid
Mohamed Moshen Gammoudi
Affiliations
[1] University of Jendouba
[2] Taibah University
[3] SMART Lab
[4] ISG de Tunis
[5] Univ. Lyon
[6] University Claude Bernard Lyon 1
[7] LIRIS
[8] Higher Institute of Multimedia Arts of Manouba
[9] RIADI
Source
Cluster Computing | 2021 / Vol. 24, No. 4
Keywords
Data placement; Resource usage; Intensive jobs; Prediction; Kernel Density Estimation; Fuzzy FCA; SOA; Autonomic computing
Abstract
Cloud computing environments have become a natural choice for hosting and processing huge volumes of data. Combining cloud computing with big data frameworks is an effective way to run data-intensive applications and tasks. An optimal arrangement of data partitions can also improve task execution, but most big data frameworks do not provide one. For example, the default distribution of data partitions in Hadoop-based clouds causes several problems, mainly related to load balancing and resource usage. In addition, most existing data placement solutions are static and lack precision in placing data partitions. To overcome these issues, we propose a data placement approach based on predicting future resource usage. We exploit Kernel Density Estimation (KDE) and Fuzzy FCA techniques to, first, forecast the future resource consumption of workers and tasks and, second, cluster data partitions and intensive jobs according to the estimated resource usage. Fuzzy FCA is also used to exclude partitions and jobs that require fewer resources, which reduces needless migrations. To allow monitoring and predicting the workers’ states and the data partitions’ consumption, we model the big data cluster as an autonomic service-based system. The results show that our solution outperforms existing approaches in terms of migration rate and resource consumption.
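The abstract names Kernel Density Estimation as the forecasting step but gives no detail. The following is only a minimal sketch of the general technique, not the authors' method: it fits a one-dimensional Gaussian KDE to past usage samples of a worker and estimates how likely the usage is to exceed a threshold. The worker names, the sample values, the 70% threshold, and the bandwidth rule are all assumptions made for illustration, and the final thresholding is a plain stand-in, not Fuzzy FCA.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return (density_function, bandwidth) for a 1-D Gaussian KDE."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        # Normal-reference rule of thumb: h = 1.06 * sigma * n^(-1/5).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centred on each observed sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density, bandwidth


def prob_exceeds(samples, threshold, bandwidth=None):
    """Estimate P(usage > threshold) under the Gaussian KDE.

    For Gaussian kernels this is the mean of the kernels' survival functions.
    """
    _, h = gaussian_kde(samples, bandwidth)
    return sum(
        0.5 * math.erfc((threshold - s) / (h * math.sqrt(2))) for s in samples
    ) / len(samples)


def most_likely_usage(samples, grid_steps=200):
    """Return the grid point in [min, max] with the highest estimated density."""
    density, _ = gaussian_kde(samples)
    lo, hi = min(samples), max(samples)
    grid = [lo + (hi - lo) * i / grid_steps for i in range(grid_steps + 1)]
    return max(grid, key=density)


# Hypothetical CPU usage histories (percent) for two workers.
history = {
    "worker-a": [78, 82, 85, 80, 90, 76, 88],
    "worker-b": [12, 15, 10, 18, 14, 11, 16],
}

THRESHOLD = 70  # assumed overload threshold, in percent

for name, samples in history.items():
    p = prob_exceeds(samples, THRESHOLD)
    label = "likely overloaded" if p > 0.5 else "likely underused"
    print(f"{name}: P(usage > {THRESHOLD}%) = {p:.3f} -> {label}")
```

A placement policy could use such probabilities to decide which workers should receive or release data partitions; how the paper actually combines the forecasts with clustering is not specified in this record.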
Pages: 3311-3336
Page count: 25