Beyond trace reconstruction: Population recovery from the deletion channel
April 11, 2019 ยท Declared Dead ยท ๐ IEEE Annual Symposium on Foundations of Computer Science
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Frank Ban, Xi Chen, Adam Freilich, Rocco A. Servedio, Sandip Sinha
arXiv ID
1904.05532
Category
cs.DS: Data Structures & Algorithms
Citations
40
Venue
IEEE Annual Symposium on Foundations of Computer Science
Last Checked
3 months ago
Abstract
\emph{Population recovery} is the problem of learning an unknown distribution over an unknown set of $n$-bit strings, given access to independent draws from the distribution that have been independently corrupted according to some noise channel. Recent work has intensively studied such problems both for the bit-flip and erasure noise channels. We initiate the study of population recovery under the \emph{deletion channel}, in which each bit is independently \emph{deleted} with some fixed probability and the surviving bits are concatenated and transmitted. This is a far more challenging noise model than bit-flip~noise or erasure noise; indeed, even the simplest case in which the population size is 1 (corresponding to a trivial distribution supported on a single string) corresponds to the \emph{trace reconstruction} problem, a challenging problem that has received much recent attention (see e.g.~\cite{DOS17,NP17,PZ17,HPP18,HHP18}). We give algorithms and lower bounds for population recovery under the deletion channel when the population size is some $\ell>1$. As our main sample complexity upper bound, we show that for any $\ell=o(\log n/\log \log n)$, a population of $\ell$ strings from $\{0,1\}^n$ can be learned under deletion channel noise using $\smash{2^{n^{1/2+o(1)}}}$ samples. On the lower bound side, we show that $n^{ฮฉ(\ell)}$ samples are required to perform population recovery under the deletion channel, for all $\ell \leq n^{1/2-ฮต}$. Our upper bounds are obtained via a robust multivariate generalization of a polynomial-based analysis, due to Krasikov and Roddity \cite{KR97}, of how the $k$-deck of a bit-string uniquely identifies the string; this is a very different approach from recent algorithms for trace reconstruction (the $\ell=1$ case). Our lower bounds build on moment-matching results of Roos~\cite{Roo00} and Daskalakis and Papadimitriou~\cite{DP15}.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Data Structures & Algorithms
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
Relief-Based Feature Selection: Introduction and Review
R.I.P.
๐ป
Ghosted
Route Planning in Transportation Networks
R.I.P.
๐ป
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
๐ป
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
๐ป
Ghosted
Graph Isomorphism in Quasipolynomial Time
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted