Parallelized In-Network Aggregation for Failure Repair in Erasure-Coded Storage Systems

Cited by: 7
Authors
Xia, Junxu [1]
Luo, Lailong [1]
Sun, Bowen [1]
Cheng, Geyao [1]
Guo, Deke [2]
Affiliations
[1] Natl Univ Def Technol, Sci & Technol Informat Syst Engn Lab, Changsha 410073, Hunan, Peoples R China
[2] Xiangjiang Lab, Changsha 410073, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Erasure code; distributed storage system; programmable switch; fault tolerance;
DOI
10.1109/TNET.2024.3367995
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
To repair a failed block in an erasure-coded storage system, multiple related blocks have to be retrieved from other storage nodes across the network. This process can cause significant incast-type repair traffic and delay. Existing efforts mainly schedule the transmission of the requested blocks across different storage nodes to avoid network congestion. At their core, they use some of the involved hosts to relay or aggregate the file blocks from others. We observe, however, that the programmability and capability of today's network devices (i.e., routers and switches) offer a great opportunity to further speed up the repair process by aggregating the file blocks on such devices. By migrating the aggregation operations from the network edge to the network core, it is possible to save more time and bandwidth. With this intuition, we propose Paint, a parallelized in-network aggregation framework for failure repair. Paint uses programmable switches to aggregate relevant data and improves repair performance by implementing multiple parallelized repair pipelines. We propose a series of novel and time-efficient algorithms to construct the routing paths for Paint, and we design the Aggregation Control Protocol to implement Paint in production clusters. To the best of our knowledge, this is the first work to explore and implement parallelized in-network repair with programmable switches. Extensive experiments on the prototype system and real-world datasets indicate that Paint can significantly improve repair performance while effectively reducing bandwidth overhead.
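The intuition behind aggregating inside the network can be illustrated with a toy sketch. It is not Paint's algorithm or its Aggregation Control Protocol; it assumes a single-parity XOR code in place of a general erasure code, and the function names and the grouping of helpers under two hypothetical aggregating switches are illustrative only. Because the repair computation is associative, partial results combined on the path give the same repaired block while fewer blocks converge on the requester's link.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def repair_at_requester(helpers: list[bytes]) -> tuple[bytes, int]:
    """Baseline: every helper block travels to the requester, which aggregates.

    Returns the repaired block and the number of blocks arriving at the requester.
    """
    return reduce(xor_blocks, helpers), len(helpers)


def repair_in_network(groups: list[list[bytes]]) -> tuple[bytes, int]:
    """Each group models the helpers below one hypothetical aggregating switch.

    Every switch forwards a single partial result, so the requester receives
    one block per group instead of one block per helper.
    """
    partials = [reduce(xor_blocks, group) for group in groups]
    return reduce(xor_blocks, partials), len(partials)


# Toy stripe (assumption): four data blocks and one XOR parity block.
data = [bytes([value] * 8) for value in (1, 2, 3, 4)]
parity = reduce(xor_blocks, data)

lost = data[0]                 # the block to repair
helpers = data[1:] + [parity]  # surviving blocks used for the repair

baseline, baseline_incoming = repair_at_requester(helpers)
aggregated, aggregated_incoming = repair_in_network([helpers[:2], helpers[2:]])

print(baseline == lost, aggregated == lost)
print(baseline_incoming, aggregated_incoming)
```

In this sketch both strategies recover the same block, and the requester's incoming block count drops from the number of helpers to the number of aggregation points. A real deployment would use a code such as Reed-Solomon, where each helper block is scaled by a coefficient before the sum, and would have to respect switch memory and packet-processing limits that this sketch ignores.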
Pages: 2888-2903
Number of pages: 16