CareDedup: Cache-aware Deduplication for Reading Performance Optimization in Primary Storage

Cited by: 37
Authors
Lin, Bin [1]
Li, Shanshan [1]
Liao, Xiangke [1]
Liu, Xiaodong [1]
Zhang, Jing [2]
Jia, Zhouyang [1]
Affiliations
[1] Natl Univ Def Technol, Sch Comp, Changsha, Hunan, Peoples R China
[2] Sci & Technol Complex Aviat Syst Simulat Lab, Beijing, Peoples R China
Source
2016 IEEE FIRST INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC 2016) | 2016
Keywords
Deduplication; Fragmentation; Cache
DOI
10.1109/DSC.2016.56
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deduplication is increasingly used to reduce the cost of primary storage. In practice, it often causes additional on-disk fragmentation that impairs reading performance. Existing deduplication algorithms mainly focus on static data layout design, so that random I/O requests are largely avoided and the harmful effect is alleviated. However, our trace-driven emulations show that deduplication does not always impair reading: it also offers new opportunities for reading performance optimization through more cache hits. Motivated by this, we propose CareDedup, a novel cache-aware deduplication scheme that exploits these opportunities. Based on a uniform locality assessment algorithm, CareDedup selects the most profitable duplicate blocks to deduplicate in order to maximize reading performance. Our experimental evaluation using real-world traces shows that, compared with sequence-based deduplication algorithms, both the duplicate elimination ratio and the reading performance (latency) can be improved simultaneously. Given a desired duplicate elimination ratio, CareDedup consistently outperforms the sequence-based method, further reducing reading latency by 2-5%.
Pages: 1-10
Page count: 10