EAFR: An Energy-Efficient Adaptive File Replication System in Data-Intensive Clusters

Cited by: 14
Authors
Lin, Yuhua [1]
Shen, Haiying [2]
Affiliations
[1] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29634 USA
[2] Univ Virginia, Dept Comp Sci, Charlottesville, VA 22904 USA
Funding
National Science Foundation (USA);
Keywords
Data-intensive clusters; file replication; replica placement; energy-efficient; DATA CENTERS; MANAGEMENT; REDUCTION;
DOI
10.1109/TPDS.2016.2613989
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In data-intensive clusters, a large number of files are stored, processed, and transferred simultaneously. To increase data availability, some file systems create and store three replicas of each file on randomly selected servers across different racks. However, they neglect file heterogeneity and server heterogeneity, which can be leveraged to further enhance data availability and file system efficiency. Because files have heterogeneous popularities, a rigid number of three replicas may not provide immediate response to an excessive number of read requests to hot files, and wastes resources (including energy) on replicas of cold files that have few read requests. Also, because servers are heterogeneous in network bandwidth, hardware configuration, and capacity (i.e., the maximal number of service requests that can be supported simultaneously), it is crucial to select replica servers to ensure low replication delay and request response delay. In this paper, we propose an Energy-Efficient Adaptive File Replication System (EAFR), which incorporates three components. It is adaptive to time-varying file popularities to achieve a good tradeoff between data availability and efficiency. Higher popularity of a file leads to more replicas and vice versa. Also, to achieve energy efficiency, servers are classified into hot servers and cold servers with different energy consumption, and cold files are stored in cold servers. EAFR then selects a server with sufficient capacity (including network bandwidth and capacity) to hold a replica. To further improve the performance of EAFR, we propose a dynamic transmission rate adjustment strategy to prevent potential incast congestion when replicating a file to a server, a network-aware data node selection strategy to reduce file read latency, and a load-aware replica maintenance strategy to quickly create file replicas under replica node failures.
Experimental results on a real-world cluster show the effectiveness of EAFR and proposed strategies in reducing file read latency, replication time, and power consumption in large clusters.
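The abstract's core idea (more replicas for popular files, cold files placed on cold servers, and a replica host chosen by available capacity) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Server` fields, the function names, the `per_replica_rate` and `cold_threshold` parameters, the replica bounds, and the rule that hot files go to hot servers are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    is_hot: bool        # assumed flag: hot = fully powered, cold = low-power
    free_capacity: int  # assumed unit: extra concurrent requests it can serve


def target_replicas(read_rate, per_replica_rate, min_replicas=1, max_replicas=10):
    """Scale the replica count with popularity (bounds are assumed values)."""
    needed = math.ceil(read_rate / per_replica_rate)
    return max(min_replicas, min(max_replicas, needed))


def is_cold_file(read_rate, cold_threshold):
    """Assumed rule: a file is cold when its read rate is below a threshold."""
    return read_rate < cold_threshold


def pick_server(servers, file_is_cold, already_used=()):
    """Pick the matching-class server with the most free capacity, or None."""
    candidates = [
        s for s in servers
        if s.is_hot != file_is_cold      # cold files -> cold servers
        and s.free_capacity > 0          # must be able to take more load
        and s.name not in already_used   # one replica per server
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.free_capacity)


servers = [
    Server("a", is_hot=True, free_capacity=5),
    Server("b", is_hot=True, free_capacity=9),
    Server("c", is_hot=False, free_capacity=4),
]
hot_choice = pick_server(servers, file_is_cold=False)
cold_choice = pick_server(servers, file_is_cold=True)
print(hot_choice.name, cold_choice.name)
```

A real system would also account for rack placement, network bandwidth, and changes in popularity over time, which the abstract mentions but this sketch leaves out.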
Pages: 1017-1030
Number of pages: 14
Related papers
50 in total
  • [1] EAFR: An Energy-Efficient Adaptive File Replication System In Data-Intensive Clusters
    Lin, Yuhua
    Shen, Haiying
    24TH INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS ICCCN 2015, 2015,
  • [2] An Energy-Efficient and Reliable Storage Mechanism for Data-Intensive Academic Archive Systems
    Chen, Tseng-Yi
    Wei, Hsin-Wen
    Yeh, Tsung-Tai
    Hsu, Tsan-Sheng
    Shih, Wei-Kuan
    ACM TRANSACTIONS ON STORAGE, 2015, 11 (02)
  • [3] An Energy-Efficient Distributed File System
    Liu, Tzong-Jye
    Tseng, Wen-Chun
    2012 7TH INTERNATIONAL CONFERENCE ON COMPUTING AND CONVERGENCE TECHNOLOGY (ICCCT2012), 2012, : 426 - 431
  • [4] Leveraging Endpoint Flexibility in Data-Intensive Clusters
    Chowdhury, Mosharaf
    Kandula, Srikanth
    Stoica, Ion
    ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2013, 43 (04) : 231 - 242
  • [5] A System for Energy-Efficient Data Management
    Tu, Yi-Cheng
    Wang, Xiaorui
    Zeng, Bo
    Xu, Zichen
    SIGMOD RECORD, 2014, 43 (01) : 21 - 26
  • [6] Is it time to revisit Erasure Coding in Data-intensive clusters?
    Darrous, Jad
    Ibrahim, Shadi
    Perez, Christian
    2019 IEEE 27TH INTERNATIONAL SYMPOSIUM ON MODELING, ANALYSIS, AND SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS (MASCOTS 2019), 2019, : 165 - 178
  • [7] Towards Energy-Efficient and Thermal-Aware Data Placement for Storage Clusters
    Li, Jie
    Deng, Yuhui
    Fan, Zhifeng
    Zhong, Zijie
    Min, Geyong
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2024, 9 (04) : 631 - 647
  • [8] Analysis of Optimal File Placement for Energy-Efficient File-Sharing Cloud Storage System
    Machida, Fumio
    Hasebe, Koji
    Abe, Hirotake
    Kato, Kazuhiko
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2022, 7 (01) : 75 - 86
  • [9] Adaptive Replica Management Model for Data-Intensive Application
    Tian, Tian
    Dong, Liu
    Yi, He
    INFORMATION COMPUTING AND APPLICATIONS, ICICA 2013, PT I, 2013, 391 : 150 - +
  • [10] EnE-Rep: An Energy-Efficient Data Replication Strategy for Clouds
    Alghobiri, Mohammed
    BALTIC JOURNAL OF MODERN COMPUTING, 2024, 12 (03) : 304 - 326