OPTIMIZING HADOOP DATA LOCALITY: PERFORMANCE ENHANCEMENT STRATEGIES IN HETEROGENEOUS COMPUTING ENVIRONMENTS

Cited by: 0
Authors
Kim, Si-Yeong [1]
Kim, Tai-Hoon [1]
Affiliations
[1] Chonnam Natl Univ, Sch Elect & Comp Engn, Yeosu Campus, Gwangju 59626, South Korea
Source
SCALABLE COMPUTING-PRACTICE AND EXPERIENCE | 2024 / Vol. 25 / No. 06
Keywords
Hadoop; Data Locality; Performance Enhancement; Heterogeneous Computing; Distributed Computing; Big Data;
DOI
10.12694/scpe.v25i6.3294
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
As organizations increasingly harness big data for analytics and decision-making, the efficient processing of massive datasets becomes paramount. Hadoop, a widely adopted distributed computing framework, excels in processing large-scale data. However, its performance is contingent on effective data locality, which becomes challenging in heterogeneous computing environments comprising diverse hardware resources. This research addresses the imperative of enhancing Hadoop's data locality performance in heterogeneous computing environments. The study explores strategies to optimize data placement and task scheduling, considering the diverse characteristics of nodes within the infrastructure. Through a comprehensive analysis of Hadoop's data locality algorithms and their impact on performance, this work proposes novel approaches to mitigate challenges associated with disparate hardware capabilities. Weighted Extreme Learning Machine Technique (Weighted ELM) with the Firefly Algorithm (WELM-FF) is used in the proposed work. The integration of Weighted Extreme Learning Machine (WELM) with the Firefly Algorithm holds promise for enhancing machine learning models in the context of large-scale data processing. The research employs a combination of theoretical analysis and practical experiments to evaluate the effectiveness of the proposed enhancements. Factors such as network latency, disk I/O, and CPU capabilities are taken into account to develop a holistic framework for improving data locality and, consequently, overall Hadoop performance. The findings presented in this study contribute valuable insights to the field of distributed computing, offering practical recommendations for organizations seeking to maximize the efficiency of their Hadoop deployments in heterogeneous computing environments. By addressing the intricacies of data locality, this research strives to enhance the scalability and performance of Hadoop clusters, thereby facilitating more effective utilization of big data resources.
Pages: 4558-4575
Page count: 18
Related papers
27 in total
  • [21] Efficient selection strategies towards processor reordering techniques for improving data locality in heterogeneous clusters
    Hsu, Ching-Hsien
    Chen, Shih-Chang
    JOURNAL OF SUPERCOMPUTING, 2012, 60 (03): 284-300
  • [22] Improving MapReduce Performance by Data Prefetching in Heterogeneous or Shared Environments
    Gu, Tao
    Zuo, Chuang
    Liao, Qun
    Yang, Yulu
    Li, Tao
    INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2013, 6 (05): 71-81
  • [23] Efficient selection strategies towards processor reordering techniques for improving data locality in heterogeneous clusters
    Ching-Hsien Hsu
    Shih-Chang Chen
    The Journal of Supercomputing, 2012, 60: 284-300
  • [24] An Energy-Aware High Performance Task Allocation Strategy in Heterogeneous Fog Computing Environments
    Gai, Keke
    Qin, Xiao
    Zhu, Liehuang
    IEEE TRANSACTIONS ON COMPUTERS, 2021, 70 (04): 626-639
  • [25] Performance optimization of heterogeneous computing for large-scale dynamic graph data
    Wang, Haifeng
    Guo, Wenkang
    Zhang, Ming
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01)
  • [26] Genetic Algorithm-Based Approach for Optimizing Query Performance in Big Data Environments
    Rabaaoui, Sana
    Aloui, Kamel
    Naceur, Mohamed Saber
    Barkaoui, Kamel
    2024 IEEE INTERNATIONAL CONFERENCE ON ADVANCED SYSTEMS AND EMERGENT TECHNOLOGIES, ICASET 2024, 2024
  • [27] Large-Scale Data Computing Performance Comparisons on SYCL Heterogeneous Parallel Processing Layer Implementations
    Shin, Woosuk
    Yoo, Kwan-Hee
    Baek, Nakhoon
    APPLIED SCIENCES-BASEL, 2020, 10 (05)