Efficient Lossy Compression for Scientific Data Based on Pointwise Relative Error Bound

Cited by: 19
Authors
Di, Sheng [1]
Tao, Dingwen [2]
Liang, Xin [3]
Cappello, Franck [1]
Affiliations
[1] Argonne Natl Lab, Math & Comp Sci MCS Div, 9700 S Cass Ave, Argonne, IL 60439 USA
[2] Univ Alabama, Dept Comp Sci, Tuscaloosa, AL 35487 USA
[3] Univ Calif Riverside, Comp Sci Dept, Riverside, CA 92521 USA
Keywords
Lossy compression; science data; high performance computing; relative error bound;
DOI
10.1109/TPDS.2018.2859932
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
An effective data compressor is becoming increasingly critical to today's scientific research, and many lossy compressors have been developed in the context of absolute error bounds. Based on physical/chemical definitions of simulation fields or multiresolution demand, however, many scientific applications need to compress data with a pointwise relative error bound (i.e., the smaller the data value, the smaller the compression error that can be tolerated). To this end, we propose two optimized lossy compression strategies under a state-of-the-art three-stage compression framework (prediction + quantization + entropy encoding). The first strategy (called the block-based strategy) splits the data set into many small blocks and computes an absolute error bound for each block, so it is particularly suitable for data with relatively high consecutiveness in space. The second strategy (called the multi-threshold-based strategy) splits the whole value range into multiple groups with exponentially increasing thresholds and performs the compression in each group separately, which is particularly suitable for data with a relatively large value range and spiky value changes. We implement the two strategies rigorously and evaluate them comprehensively using two scientific applications that both require lossy compression with a pointwise relative error bound. Experiments show that each of the two strategies exhibits the best compression quality on a different type of data set. The compression ratio of our lossy compressor is higher than that of other state-of-the-art compressors by 17.2-618 percent on the climate simulation data and 30-210 percent on the N-body simulation data, with the same relative error bound and without degradation of the overall visualization effect of the entire data.
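To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the prediction and entropy-encoding stages are omitted, plain uniform quantization stands in for the quantization stage, and the function names (`compress_blocks`, `decompress_blocks`, `group_abs_bound`), the choice of the block minimum as the reference magnitude, and the base-2 thresholds are all assumptions made for illustration.

```python
import math

def compress_blocks(data, block_size, rel_bound):
    """Block-based idea (sketch): derive one absolute error bound per block.

    Assumption: the bound is rel_bound * min|x| over the block, so every
    point in the block also satisfies its own pointwise relative bound.
    """
    out = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        abs_bound = rel_bound * min(abs(x) for x in block)
        if abs_bound == 0.0:
            # A zero in the block tolerates no error: keep the block exactly.
            out.append((0.0, list(block)))
        else:
            step = 2.0 * abs_bound  # bin width; rounding error is at most abs_bound
            out.append((abs_bound, [round(x / step) for x in block]))
    return out

def decompress_blocks(blocks):
    """Invert compress_blocks: map each integer code back to its bin center."""
    data = []
    for abs_bound, codes in blocks:
        if abs_bound == 0.0:
            data.extend(codes)
        else:
            step = 2.0 * abs_bound
            data.extend(c * step for c in codes)
    return data

def group_abs_bound(x, rel_bound):
    """Multi-threshold idea (sketch): group values by power-of-two thresholds.

    Assumption: a value with 2**k <= |x| < 2**(k+1) belongs to group k and
    is given the absolute bound rel_bound * 2**k, which never exceeds
    rel_bound * |x|.
    """
    if x == 0.0:
        return 0.0
    _, e = math.frexp(x)  # |x| lies in [2**(e-1), 2**e)
    return rel_bound * math.ldexp(1.0, e - 1)

values = [1.0, 1.1, 0.9, 100.0, 120.0, 95.0, 0.0, 0.001]
restored = decompress_blocks(compress_blocks(values, 3, 0.01))
```

In this sketch, a block that mixes very small and very large values is forced to use the tight bound of its smallest value, which suggests why the abstract recommends the block-based strategy for spatially smooth data and the multi-threshold strategy for data with a large value range and spiky changes.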
Pages: 331-345
Number of pages: 15