Mixing Deduplication and Compression on Active Data Sets

Cited by: 24
Authors
Constantinescu, Cornel [1]
Glider, Joseph [1]
Chambliss, David [1]
Affiliations
[1] IBM Almaden Research Center, San Jose, CA 95120, USA
Source
2011 Data Compression Conference (DCC) | 2011
DOI
10.1109/DCC.2011.46
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Many new storage systems provide some form of data reduction. We examine data reduction methods that might be suitable for primary storage systems serving active data (as contrasted with backup and archive systems), by analysis of file sets found in different active data environments. We address questions of: how effective are compression and variations of deduplication, both separately and in combination; when deduplication and compression are combined, which should be applied first; what will the tradeoff be between the different methods in their use of MIPS relative to the data reduction achieved; and what degree of data reduction should be expected for different data types.
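The ordering question raised in the abstract (deduplicate first or compress first) can be made concrete with a small experiment. The sketch below is illustrative only and is not the authors' method: it assumes fixed-size 4 KiB chunks, SHA-256 chunk fingerprints, zlib as the compressor, and synthetic input. The helper names (`split_chunks`, `dedup`, `dedup_then_compress`, `compress_then_dedup`, `make_demo_files`) are hypothetical.

```python
import hashlib
import random
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real systems may chunk differently


def split_chunks(data, size=CHUNK_SIZE):
    """Split bytes into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedup(blocks):
    """Keep one copy of each distinct block, identified by its SHA-256 digest."""
    unique = {}
    for block in blocks:
        unique.setdefault(hashlib.sha256(block).digest(), block)
    return list(unique.values())


def dedup_then_compress(files):
    """Chunk the raw files, drop duplicate chunks, then compress each survivor."""
    unique = dedup([c for f in files for c in split_chunks(f)])
    return sum(len(zlib.compress(c)) for c in unique)


def compress_then_dedup(files):
    """Compress each file whole, then chunk and deduplicate the compressed streams."""
    streams = [zlib.compress(f) for f in files]
    unique = dedup([c for s in streams for c in split_chunks(s)])
    return sum(len(c) for c in unique)


def make_demo_files(seed=0):
    """Build two synthetic files: a distinct first chunk, then identical content."""
    rng = random.Random(seed)

    def blob(n):
        # Random hex characters: compressible, but with no long repeats.
        return bytes(rng.choices(b"0123456789abcdef", k=n))

    shared = blob(8 * CHUNK_SIZE)
    return [blob(CHUNK_SIZE) + shared, blob(CHUNK_SIZE) + shared]


if __name__ == "__main__":
    files = make_demo_files()
    print("raw bytes:          ", sum(len(f) for f in files))
    print("dedup then compress:", dedup_then_compress(files))
    print("compress then dedup:", compress_then_dedup(files))
```

On this synthetic input, compressing first tends to hide the shared chunks, because the two compressed streams diverge after the differing first chunk. That outcome reflects only this toy setup; this record does not state which ordering the paper found preferable for real active data sets.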
Pages: 393-402
Page count: 10