Sparse Suffix Tree Construction in Optimal Time and Space
August 02, 2016 · Declared Dead · ACM-SIAM Symposium on Discrete Algorithms
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Paweł Gawrychowski, Tomasz Kociumaka
arXiv ID
1608.00865
Category
cs.DS: Data Structures & Algorithms
Citations
24
Venue
ACM-SIAM Symposium on Discrete Algorithms
Last Checked
3 months ago
Abstract
Suffix tree (and the closely related suffix array) are fundamental structures capturing all substrings of a given text essentially by storing all its suffixes in the lexicographical order. In some applications, we work with a subset of $b$ interesting suffixes, which are stored in the so-called sparse suffix tree. Because the size of this structure is $\Theta(b)$, it is natural to seek a construction algorithm using only $O(b)$ words of space assuming read-only random access to the text. We design a linear-time Monte Carlo algorithm for this problem, hence resolving an open question explicitly stated by Bille et al. [TALG 2016]. The best previously known algorithm by I et al. [STACS 2014] works in $O(n\log b)$ time. Our solution proceeds in $n/b$ rounds; in the $r$-th round, we consider all suffixes starting at positions congruent to $r$ modulo $n/b$. By maintaining rolling hashes, we lexicographically sort all interesting suffixes starting at such positions, and then we merge them with the already considered suffixes. For efficient merging, we also need to answer LCE queries in small space. By plugging in the structure of Bille et al. [CPM 2015] we obtain $O(n+b\log b)$ time complexity. We improve this structure, which implies a linear-time sparse suffix tree construction algorithm. We complement our Monte Carlo algorithm with a deterministic verification procedure. The verification takes $O(n\sqrt{\log b})$ time, which improves upon the bound of $O(n\log b)$ obtained by I et al. [STACS 2014]. This is obtained by first observing that the pruning done inside the previous solution has a rather clean description using the notion of graph spanners with small multiplicative stretch. Then, we are able to decrease the verification time by applying difference covers twice. Combined with the Monte Carlo algorithm, this gives us an $O(n\sqrt{\log b})$-time and $O(b)$-space Las Vegas algorithm.
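The abstract's core ingredients (rolling hashes, LCE queries, and sorting a chosen subset of suffixes) can be illustrated with a small sketch. This is not the paper's algorithm: it stores prefix hashes for the whole text, so it uses $O(n)$ extra space instead of $O(b)$, and it sorts by comparison rather than in rounds. It only shows how a hash-based LCE query lets two suffixes be compared without scanning them. The modulus, the fixed base, and all function names are assumptions made for this example; a real Monte Carlo scheme would choose the base at random.

```python
import functools

MOD = (1 << 61) - 1   # Mersenne prime modulus (assumed choice for this sketch)
BASE = 911382323      # assumed fixed base; a Monte Carlo scheme would randomize it


def build_prefix_hashes(text):
    """Return (prefix, power), where prefix[i] is the hash of text[:i]."""
    prefix = [0] * (len(text) + 1)
    power = [1] * (len(text) + 1)
    for i, ch in enumerate(text):
        prefix[i + 1] = (prefix[i] * BASE + ord(ch)) % MOD
        power[i + 1] = (power[i] * BASE) % MOD
    return prefix, power


def substring_hash(prefix, power, start, length):
    """Hash of text[start:start + length], computed from the prefix hashes."""
    return (prefix[start + length] - prefix[start] * power[length]) % MOD


def lce(text, prefix, power, i, j):
    """Longest common extension of suffixes i and j (correct with high probability)."""
    lo, hi = 0, len(text) - max(i, j)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if substring_hash(prefix, power, i, mid) == substring_hash(prefix, power, j, mid):
            lo = mid          # prefixes of length mid agree, try longer
        else:
            hi = mid - 1      # mismatch within the first mid characters
    return lo


def sparse_suffix_array(text, positions):
    """Sort the suffixes starting at `positions` in lexicographic order."""
    prefix, power = build_prefix_hashes(text)
    n = len(text)

    def compare(i, j):
        if i == j:
            return 0
        k = lce(text, prefix, power, i, j)
        if i + k == n:        # suffix i is a proper prefix of suffix j
            return -1
        if j + k == n:        # suffix j is a proper prefix of suffix i
            return 1
        return -1 if text[i + k] < text[j + k] else 1

    return sorted(positions, key=functools.cmp_to_key(compare))
```

For example, `sparse_suffix_array("banana", [0, 2, 4])` orders the suffixes "banana", "nana", and "na" by their starting positions. Each comparison costs $O(\log n)$ hash lookups plus a single character comparison, which is the role LCE queries play in the merging step the abstract describes.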
Similar Papers
In the same crypt: Data Structures & Algorithms
The Cartographer
R.I.P.
👻
Ghosted
Route Planning in Transportation Networks
R.I.P.
👻
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
👻
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
👻
Ghosted
Graph Isomorphism in Quasipolynomial Time
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way: 👻 Ghosted
R.I.P.
👻
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
👻
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
👻
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
👻
Ghosted