TREAT - Two wRongs makE A righT: efficient distributed storage and queries of IoT datasets with erasure coding and compression

Cited by: 0
Authors
Taurone, Francesco [1]
Feher, Marcell [2]
Sipos, Marton [2]
Lucani, Daniel E. [1]
Affiliations
[1] Aarhus Univ, Aarhus, Denmark
[2] Chocolate Cloud ApS, Aarhus, Denmark
Source
PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON DISTRIBUTED AND EVENT-BASED SYSTEMS, DEBS 2024 | 2024
Keywords
Erasure coding; Time series; Distributed Storage; Query; Compression; Generalized Deduplication; IoT; RLNC; INFORMATION; CODES
DOI
10.1145/3629104.3666039
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Erasure coding in distributed multi-cloud data storage increases availability, durability and security, but it also makes data analytics inefficient since the whole dataset must be reconstructed to answer a query, even if the result set is a small fraction of the complete file. Data compression has a similar trade-off as it can reduce storage costs while requiring the entire compressed data to be collected and decompressed in order to access even a few bytes. We propose TREAT, a novel method that combines erasure coding and compression to achieve efficient queries of time series datasets while keeping the benefits of both underlying techniques. Our evaluation of five real-life datasets shows that it can answer range queries up to 25 times faster with 100 times less data transfer than reconstructing the whole dataset.
Pages: 147-158
Page count: 12