Enhancing throughput of the Hadoop Distributed File System for interaction-intensive tasks

Cited by: 21
Authors
Hua, Xiayu [1]
Wu, Hao [1]
Li, Zheng [1]
Ren, Shangping [1]
Affiliations
[1] IIT, Dept Comp Sci, Chicago, IL 60616 USA
Funding
National Science Foundation (USA); National Aeronautics and Space Administration (NASA)
Keywords
HDFS; Interaction intensive task; Cache; Hierarchical structure; PSO; Storage allocation algorithm;
DOI
10.1016/j.jpdc.2014.03.010
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
The Hadoop Distributed File System (HDFS) is designed to run on commodity hardware and can be used as a stand-alone, general-purpose distributed file system (HDFS User Guide, 2008). It provides access to bulk data with high I/O throughput, which makes it suitable for applications with large I/O data sets. However, HDFS performance decreases dramatically when handling interaction-intensive files, i.e., files that are relatively small but frequently accessed. The paper analyzes the cause of the throughput degradation when accessing interaction-intensive files and presents an enhanced HDFS architecture, along with an associated storage allocation algorithm, that overcomes the performance degradation problem. Experiments show that, with the proposed architecture and the associated storage allocation algorithm, HDFS throughput for interaction-intensive files increases by 300% on average, with only a negligible performance decrease for large data set tasks. (C) 2014 Elsevier Inc. All rights reserved.
Pages: 2770 - 2779
Page count: 10
Related papers
36 in total
  • [21] A Study of Effective Replica Reconstruction Schemes for the Hadoop Distributed File System
    Higai, Asami
    Takefusa, Atsuko
    Nakada, Hidemoto
    Oguchi, Masato
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (04) : 872 - 882
  • [22] SD-HDFS: Secure Deletion in Hadoop Distributed File System
    Agrawal, Bikash
    Hansen, Raymond
    Rong, Chunming
    Wiktorski, Tomasz
    2016 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2016, 2016, : 181 - 189
  • [23] A Distributed and Cooperative NameNode Cluster for a Highly-Available Hadoop Distributed File System
    Kim, Yonghwan
    Araragi, Tadashi
    Nakamura, Junya
    Masuzawa, Toshimitsu
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (04) : 835 - 851
  • [24] Enabling Top-N File Retrieval In Cloud Storage Using Hadoop Distributed File System
    Jeya, Jospin J.
    Kannan, E.
    2016 Second International Conference on Science Technology Engineering and Management (ICONSTEM), 2016, : 168 - 171
  • [25] NDCouplingHDFS: A Coupling Architecture for a Power-Proportional Hadoop Distributed File System
    Hieu Hanh Le
    Hikida, Satoshi
    Yokota, Haruo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (02) : 213 - 222
  • [26] Enabling Prioritized Cloud I/O Service in Hadoop Distributed File System
    Yeh, Tsozen
    Sun, Yifeng
    2014 IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2014 IEEE 6TH INTL SYMP ON CYBERSPACE SAFETY AND SECURITY, 2014 IEEE 11TH INTL CONF ON EMBEDDED SOFTWARE AND SYST (HPCC,CSS,ICESS), 2014, : 256 - 259
  • [27] Blockchain Enabled Hadoop Distributed File System Framework for Secure and Reliable Traceability
    Gupta, Manish Kumar
    Dwivedi, Rajendra Kumar
    ADCAIJ-ADVANCES IN DISTRIBUTED COMPUTING AND ARTIFICIAL INTELLIGENCE JOURNAL, 2023, 12 (01):
  • [28] LAST-HDFS: Location-Aware Storage Technique for Hadoop Distributed File System
    Liao, Cong
    Squicciarini, Anna
    Lin, Dan
    PROCEEDINGS OF 2016 IEEE 9TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2016, : 662 - 669
  • [29] Stacked multi-layer security for Hadoop distributed file system using HSCT steganography
    Suganya, S.
    Selvamuthukumaran, S.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (21)
  • [30] CoS-HDFS: Co-Locating Geo-Distributed Spatial Data in Hadoop Distributed File System
    Fahmy, Mariam Malak
    Elghandour, Iman
    Nagi, Magdy
    2016 3RD IEEE/ACM INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING, APPLICATIONS AND TECHNOLOGIES (BDCAT), 2016, : 123 - 132