Hybrid deduplication system with content-based cache for cloud environment

Cited by: 1
Authors
Godavari, Amdewar [1]
Sudhakar, Chapram [2]
Ramesh, T. [2]
Affiliations
[1] Kakatiya Inst Technol & Sci, Dept Comp Sci & Engn Networks, Warangal, India
[2] Natl Inst Technol Warangal, Dept Comp Sci & Engn, Warangal 506004, India
Keywords
Deduplication; Content based cache; Disk bottleneck
DOI
10.1016/j.jksuci.2024.102030
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Primary storage deduplication systems are performance sensitive. Their performance depends on two factors: metadata access for duplicate detection and the strategy for eliminating duplicate data. Various approaches to duplicate detection through suitable caching mechanisms have been proposed in the literature. Most of them assume that primary workloads exhibit strong temporal locality. This cannot be assumed in the cloud, where interference among different workloads on the same system destroys workload locality. Duplicate content among data blocks with different addresses leads to inefficient utilization of the data cache. Applying deduplication in this context causes data blocks to be shared among clients with different access patterns and frequencies. In this situation an LRU cache, which considers only the recency of references, is not appropriate. This paper proposes a Hybrid Deduplication System (HDS) containing a content-based cache with a new replacement policy, a Modified Adaptive Replacement Cache (ARC). The proposed system is simulated in a Linux environment using three different types of FIU traces. Its effectiveness is compared with that of a full deduplication system. Experimental results show that the system performs consistently better than the full deduplication system in reducing metadata overhead for all three data sets.
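The idea of a content-based cache mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' HDS or their modified ARC policy, neither of which is described in this record; the class name `ContentBasedCache`, the SHA-1 fingerprint, and the plain LRU eviction are all assumptions chosen for illustration. The sketch only shows that when blocks are indexed by content rather than by address, blocks at different addresses with identical content occupy a single cache slot.

```python
import hashlib
from collections import OrderedDict


class ContentBasedCache:
    """Toy content-addressed block cache (hypothetical, for illustration).

    Eviction here is plain LRU, which the abstract argues is a poor fit
    for shared blocks; the paper's modified ARC is not reproduced.
    """

    def __init__(self, capacity):
        self.capacity = capacity      # max number of distinct block contents
        self.addr_to_fp = {}          # block address -> content fingerprint
        self.blocks = OrderedDict()   # fingerprint -> data, oldest first

    @staticmethod
    def fingerprint(data):
        # SHA-1 is an assumed choice of fingerprint function.
        return hashlib.sha1(data).hexdigest()

    def write(self, addr, data):
        fp = self.fingerprint(data)
        self.addr_to_fp[addr] = fp
        if fp in self.blocks:
            # Duplicate content: reuse the cached copy, refresh recency.
            self.blocks.move_to_end(fp)
        else:
            self.blocks[fp] = data
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used

    def read(self, addr):
        fp = self.addr_to_fp.get(addr)
        if fp is None or fp not in self.blocks:
            return None  # cache miss
        self.blocks.move_to_end(fp)
        return self.blocks[fp]
```

In an address-based cache, writing the same content to two addresses would consume two slots; here it consumes one, which is the cache-utilization point the abstract makes.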
Pages: 12