TREAT - Two wRongs makE A righT: efficient distributed storage and queries of IoT datasets with erasure coding and compression

Citations: 0
Authors
Taurone, Francesco [1]
Feher, Marcell [2]
Sipos, Marton [2]
Lucani, Daniel E. [1]
Affiliations
[1] Aarhus Univ, Aarhus, Denmark
[2] Chocolate Cloud ApS, Aarhus, Denmark
Source
PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON DISTRIBUTED AND EVENT-BASED SYSTEMS, DEBS 2024 | 2024
Keywords
Erasure coding; Time series; Distributed Storage; Query; Compression; Generalized Deduplication; IoT; RLNC; INFORMATION; CODES;
DOI
10.1145/3629104.3666039
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Erasure coding in distributed multi-cloud data storage increases availability, durability and security, but it also makes data analytics inefficient since the whole dataset must be reconstructed to answer a query, even if the result set is a small fraction of the complete file. Data compression has a similar trade-off as it can reduce storage costs while requiring the entire compressed data to be collected and decompressed in order to access even a few bytes. We propose TREAT, a novel method that combines erasure coding and compression to achieve efficient queries of time series datasets while keeping the benefits of both underlying techniques. Our evaluation of five real-life datasets shows that it can answer range queries up to 25 times faster with 100 times less data transfer than reconstructing the whole dataset.
Pages: 147-158
Page count: 12
Related papers
40 entries in total
[31]  
Szymonacedaski Supratim Deb, 2005, 1 WORKSH NETW COD TH
[32]   Generalized Deduplication: Lossless Compression by Clustering Similar Data [J].
Talasila, Prasad ;
Lucani, Daniel E. .
PROCEEDING OF THE 2019 IEEE 8TH INTERNATIONAL CONFERENCE ON CLOUD NETWORKING (CLOUDNET), 2019,
[33]  
Tatwawadi K, 2018, IEEE INT SYMP INFO, P891, DOI 10.1109/ISIT.2018.8437931
[34]   Hive - A Warehousing Solution Over a Map-Reduce Framework [J].
Thusoo, Ashish ;
Sen Sarma, Joydeep ;
Jain, Namit ;
Shao, Zheng ;
Chakka, Prasad ;
Anthony, Suresh ;
Liu, Hao ;
Wyckoff, Pete ;
Murthy, Raghotham .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2009, 2 (02) :1626-1629
[35]  
Vavilapalli V. K., 2013, P 4 ANN S CLOUD COMP
[36]   Enabling Random Access in Universal Compressors [J].
Vestergaard, Rasmus ;
Zhang, Qi ;
Lucani, Daniel E. .
IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (IEEE INFOCOM WKSHPS 2021), 2021,
[37]   Titchy: Online Time-Series Compression With Random Access for the Internet of Things [J].
Vestergaard, Rasmus ;
Zhang, Qi ;
Sipos, Marton ;
Lucani, Daniel E. .
IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (24) :17568-17583
[38]   Understanding users' willingness to put their personal information on the personal cloud-based storage applications: An empirical study [J].
Widjaja, Andree E. ;
Chen, Jengchung Victor ;
Sukoco, Badri Munir ;
Ha, Quang-An .
COMPUTERS IN HUMAN BEHAVIOR, 2019, 91 :167-185
[39]  
Xia Mingyuan, 2015, Proceedings of the 13th USENIX Conference on File and Storage Technologies. FAST '15, P213
[40]   CBase-EC: Achieving Optimal Throughput-Storage Efficiency Trade-Off Using Erasure Codes [J].
Xiao, Chuqiao ;
Xia, Yefeng ;
Zhang, Qian ;
Gong, Xueqing ;
Zhu, Liyan .
ELECTRONICS, 2021, 10 (02) :1-16