Review of Coded Computing

Cited by: 0
Authors
Zheng T. [1 ]
Zhou T. [1 ]
Cai Z. [1 ]
And W.H. [1 ]
Affiliations
[1] College of Computer, National University of Defense Technology, Changsha
Source
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | 2021 / Vol. 58 / No. 10
Funding
National Natural Science Foundation of China
Keywords
Coded computing; Data privacy; Distributed computing; Distributed machine learning; Network coding; Performance optimizing; System security;
DOI
10.7544/issn1000-1239.2021.20210496
Abstract
By integrating coding theory with distributed computing and exploiting flexible coding methods, coded computing relieves the transmission burden and mitigates the negative effects of stragglers, thereby improving the overall performance of distributed computing systems. Coded computing schemes are also designed to provide security and privacy guarantees for distributed computing systems, generally by adopting mechanisms such as error correction and data masking. Owing to its advantages in communication, storage, and computational complexity, coded computing has attracted extensive attention and become a popular direction in the field of distributed computing. This survey reviews the background of coded computing and clarifies its definition and core ideas. It then introduces and comparatively analyzes the existing coding schemes for the communication bottleneck, computation delay, and security and privacy. Finally, it analyzes future research directions and technical challenges of coded computing to provide valuable references for related researchers. © 2021, Science Press. All rights reserved.
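The straggler-mitigation idea summarized in the abstract can be illustrated with the classic MDS-coded matrix-vector multiplication toy example; this is a minimal sketch of that general technique (the (3, 2) code, worker split, and all names here are illustrative, not taken from the surveyed paper):

```python
import numpy as np

# Straggler-resilient matrix-vector multiplication with a (3, 2) MDS code:
# two systematic workers hold halves of A, a third holds their sum (parity),
# so the full product A @ x is recoverable from ANY two worker results.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))   # data matrix, split row-wise across workers
x = rng.standard_normal(3)

A1, A2 = A[:2], A[2:]             # systematic shares for workers 1 and 2
A3 = A1 + A2                      # parity share for worker 3

# Each worker computes its encoded product; suppose worker 2 straggles.
y1 = A1 @ x
y3 = A3 @ x

# The master decodes worker 2's missing result from the parity,
# without waiting for the straggler.
y2 = y3 - y1
y = np.concatenate([y1, y2])
assert np.allclose(y, A @ x)
```

The redundancy costs one extra worker's computation but removes the dependence on the slowest machine, which is the communication/latency trade-off the survey analyzes in depth.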
Pages: 2187-2212
Page count: 25