Distributed computing with the cloud

Cited by: 3
Authors
Afek, Yehuda [1]
Giladi, Gal [1]
Patt-Shamir, Boaz [2]
Affiliations
[1] Tel Aviv Univ, Sch CS, IL-6997801 Tel Aviv, Israel
[2] Tel Aviv Univ, Sch EE, IL-6997801 Tel Aviv, Israel
Keywords
MODEL; FLOWS;
DOI
10.1007/s00446-024-00460-w
CLC number
TP301 [Theory and methods];
Subject classification code
081202;
Abstract
We investigate the effect of omnipresent cloud storage on distributed computing. To this end, we specify a network model with links of prescribed bandwidth that connect standard processing nodes and, in addition, passive storage nodes. Each passive node represents a cloud storage system, such as Dropbox, Google Drive, etc. We study a few tasks in this model, assuming a single cloud node connected to all other nodes, which are connected to each other arbitrarily. We give implementations for the basic tasks of collaboratively writing to and reading from the cloud, and for more advanced applications such as matrix multiplication and federated learning. Our results show that utilizing node-cloud links as well as node-node links can considerably speed up computations, compared to the case where processors communicate either only through the cloud or only through the network links. We first show how to optimally read and write large files to and from the cloud in general graphs using flow techniques. We use these primitives to derive algorithms for combining, where every processor node has an input value and the task is to compute a combined value under some given associative operator. In the special but common case of "fat links," where we assume that links between processors are bidirectional and have high bandwidth, we provide near-optimal algorithms for any commutative combining operator (such as vector addition). For the task of matrix multiplication (or other non-commutative combining operators), where the inputs are ordered, we present tight results in the simple "wheel" network, where processing nodes are arranged in a ring and are all connected to a single cloud node.
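To make the combining task concrete, the following Python toy illustrates the task definition and the two baselines mentioned in the abstract (communicating only through the cloud versus only through node-node links). It is a minimal sketch, not the paper's algorithm; the node count, vector length, and the pairwise pre-combining pattern are illustrative assumptions.

```python
# Toy illustration of the "combining" task: every processor node holds an
# input vector, and the goal is the combined value under an associative
# (here also commutative) operator, vector addition.
from functools import reduce


def vec_add(a, b):
    """Commutative, associative combining operator: element-wise sum."""
    return [x + y for x, y in zip(a, b)]


def combine_via_cloud_only(inputs):
    """Baseline 1: every node uploads its full vector to the cloud,
    which is then folded once. Cloud traffic: n full-vector uploads."""
    return reduce(vec_add, inputs)


def combine_via_local_links(inputs):
    """Baseline 2: neighbouring nodes pre-combine pairwise over node-node
    links; only the final combined vector needs a single cloud write."""
    vals = list(inputs)
    while len(vals) > 1:
        nxt = [vec_add(vals[i], vals[i + 1])          # one local link each
               for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2 == 1:
            nxt.append(vals[-1])                      # odd node carried over
        vals = nxt
    return vals[0]                                    # single write to cloud


if __name__ == "__main__":
    inputs = [[i, 2 * i, 3 * i] for i in range(8)]    # 8 nodes, 3-entry vectors
    assert combine_via_cloud_only(inputs) == combine_via_local_links(inputs)
    print(combine_via_local_links(inputs))            # [28, 56, 84]
```

The paper's contribution lies between these two extremes: scheduling traffic over both node-cloud and node-node links so that neither the cloud links nor the local links become the bottleneck.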
Pages: 1-18
Page count: 18