$k$-Center Clustering with Outliers in the MPC and Streaming Model

February 24, 2023 Β· Declared Dead Β· πŸ› IEEE International Parallel and Distributed Processing Symposium

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Mark de Berg, Leyla Biabani, Morteza Monemizadeh arXiv ID 2302.12811 Category cs.DS: Data Structures & Algorithms Cross-listed cs.CG Citations 14 Venue IEEE International Parallel and Distributed Processing Symposium Last Checked 3 months ago
Abstract
Given a point set $P \subseteq X$ of size $n$ in a metric space $(X,dist)$ of doubling dimension $d$ and two parameters $k \in N$ and $z \in N$, the $k$-center problem with $z$ outliers asks to return a set $C^\ast \subseteq X$ of $k$ centers such that the maximum distance of all but $z$ points of $P$ to their nearest center in $C^\ast$ is minimized. An $(Ξ΅,k,z)$-coreset for this problem is a weighted point set $P^*$ such that an optimal solution for the $k$-center problem with $z$ outliers on $P^*$ gives a $(1\pmΞ΅)$-approximation for the $k$-center problem with $z$ outliers on $P$. We study the construction of such coresets in the Massively Parallel Computing (MPC) model, and in the insertion-only as well as the fully dynamic streaming model. We obtain the following results, for any given $0 < Ξ΅\le 1$: In all cases, the size of the computed coreset is $O(k/Ξ΅^d+z)$. - In the MPC model, we present a deterministic $2$-round and a randomized $1$-round algorithm. Additionally, we provide a deterministic algorithm that obtains a trade-off between the number of rounds, $R$, and the storage per machine. - For the insertion-only streaming model, we present an algorithm and a tight lower bound to support it. - We also discuss the dynamic streaming model, which allows both insertions and deletions in the data stream. In this model, we present the first algorithm and a lower bound. - Finally, we consider the sliding window model, where we are interested in maintaining an $(Ξ΅,k,z)$-coreset for the last $W$ points in the stream, we present a tight lower bound that confirms the optimality of the previous work by De Berg, Monemizadeh, and Zhong (ESA2020).
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Data Structures & Algorithms

Died the same way β€” πŸ‘» Ghosted