Tight Bounds on the Round Complexity of the Distributed Maximum Coverage Problem
January 09, 2018 · Declared Dead · ACM-SIAM Symposium on Discrete Algorithms
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Sepehr Assadi, Sanjeev Khanna
arXiv ID
1801.02793
Category
cs.DS: Data Structures & Algorithms
Cross-listed
cs.DC
Citations
16
Venue
ACM-SIAM Symposium on Discrete Algorithms
Last Checked
3 months ago
Abstract
We study the maximum $k$-set coverage problem in the following distributed setting. A collection of sets $S_1,\ldots,S_m$ over a universe $[n]$ is partitioned across $p$ machines and the goal is to find $k$ sets whose union covers the most number of elements. The computation proceeds in synchronous rounds. In each round, all machines simultaneously send a message to a central coordinator who then communicates back to all machines a summary to guide the computation for the next round. At the end, the coordinator outputs the answer. The main measures of efficiency in this setting are the approximation ratio of the returned solution, the communication cost of each machine, and the number of rounds of computation. Our main result is an asymptotically tight bound on the tradeoff between these measures for the distributed maximum coverage problem. We first show that any $r$-round protocol for this problem either incurs a communication cost of $k \cdot m^{\Omega(1/r)}$ or only achieves an approximation factor of $k^{\Omega(1/r)}$. This implies that any protocol that simultaneously achieves good approximation ratio ($O(1)$ approximation) and good communication cost ($\widetilde{O}(n)$ communication per machine), essentially requires logarithmic (in $k$) number of rounds. We complement our lower bound result by showing that there exist an $r$-round protocol that achieves an $\frac{e}{e-1}$-approximation (essentially best possible) with a communication cost of $k \cdot m^{O(1/r)}$ as well as an $r$-round protocol that achieves a $k^{O(1/r)}$-approximation with only $\widetilde{O}(n)$ communication per each machine (essentially best possible). We further use our results in this distributed setting to obtain new bounds for the maximum coverage problem in two other main models of computation for massive datasets, namely, the dynamic streaming model and the MapReduce model.
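To make the coordinator setting in the abstract concrete, here is a minimal Python sketch of a single-round exchange: each machine runs the classic greedy heuristic on its own sets and sends its $k$ picks to the coordinator, which runs greedy again on everything it received. This is an illustrative baseline only, not the protocol from the paper, and no approximation guarantee is claimed for it. The function names (`greedy_max_coverage`, `one_round_protocol`, `coverage`) and the toy instance are hypothetical.

```python
from typing import Dict, List, Set


def greedy_max_coverage(sets: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k sets, each time taking the one with the largest marginal gain."""
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(min(k, len(sets))):
        remaining = [name for name in sets if name not in chosen]
        # Ties are broken by iteration order of the dictionary.
        best = max(remaining, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds a new element
        chosen.append(best)
        covered |= sets[best]
    return chosen


def one_round_protocol(machines: List[Dict[str, Set[int]]], k: int) -> List[str]:
    """Hypothetical one-round baseline in the coordinator model (not the paper's protocol)."""
    received: Dict[str, Set[int]] = {}
    for local_sets in machines:
        # Message from this machine: its k locally greedy sets.
        for name in greedy_max_coverage(local_sets, k):
            received[name] = local_sets[name]
    # The coordinator solves the problem on the union of all messages.
    return greedy_max_coverage(received, k)


def coverage(sets: Dict[str, Set[int]], chosen: List[str]) -> int:
    """Number of universe elements covered by the chosen sets."""
    return len(set().union(*(sets[name] for name in chosen)))


if __name__ == "__main__":
    # Toy instance: 4 sets over the universe {1, ..., 7}, split across p = 2 machines.
    machines = [
        {"A": {1, 2, 3}, "B": {3, 4}},
        {"C": {4, 5, 6, 7}, "D": {1}},
    ]
    all_sets = {name: s for m in machines for name, s in m.items()}
    picked = one_round_protocol(machines, k=2)
    print(picked, coverage(all_sets, picked))
```

In this sketch each machine's message contains up to $k$ full sets, so the per-machine communication can be far larger than the $\widetilde{O}(n)$ budget discussed in the abstract; the multi-round tradeoff studied in the paper concerns exactly this tension between message size, rounds, and approximation quality.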
Similar Papers
In the same crypt: Data Structures & Algorithms
The Cartographer
R.I.P.
Ghosted
Route Planning in Transportation Networks
R.I.P.
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
Ghosted
Graph Isomorphism in Quasipolynomial Time
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted