Document Listing on Repetitive Collections with Guaranteed Performance

July 20, 2017 · Declared Dead · 🏛 Annual Symposium on Combinatorial Pattern Matching

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Gonzalo Navarro
arXiv ID: 1707.06374
Category: cs.DS (Data Structures & Algorithms)
Citations: 24
Venue: Annual Symposium on Combinatorial Pattern Matching
Last Checked: 3 months ago
Abstract
We consider document listing on string collections, that is, finding in which strings a given pattern appears. In particular, we focus on repetitive collections: a collection of size $N$ over alphabet $[1,σ]$ is composed of $D$ copies of a string of size $n$, and $s$ edits are applied on ranges of copies. We introduce the first document listing index with size $\tilde{O}(n+s)$, precisely $O((n\log σ+s\log^2 N)\log D)$ bits, and with useful worst-case time guarantees: Given a pattern of length $m$, the index reports the $\mathit{ndoc}>0$ strings where it appears in time $O(m\log^{1+ε} N \cdot \mathit{ndoc})$, for any constant $ε>0$ (and tells in time $O(m\log N)$ if $\mathit{ndoc}=0$). Our technique is to augment a range data structure that is commonly used on grammar-based indexes, so that instead of retrieving all the pattern occurrences, it computes useful summaries on them. We show that the idea has independent interest: we introduce the first grammar-based index that, on a text $T[1,N]$ with a grammar of size $r$, uses $O(r\log N)$ bits and counts the number of occurrences of a pattern $P[1,m]$ in time $O(m^2 + m\log^{2+ε} r)$, for any constant $ε>0$. We also give the first index using $O(z\log(N/z)\log N)$ bits, where $T$ is parsed by Lempel-Ziv into $z$ phrases, counting occurrences in time $O(m\log^{2+ε} N)$.
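To make the problem statement concrete, here is a minimal brute-force sketch of the two queries the abstract mentions: document listing and occurrence counting. It is not the paper's index and has none of its space or time guarantees; it only shows what the queries return. The function names, the edit tuple layout, and the restriction to single-character substitutions are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical edit format: (first_copy, last_copy, position, new_char).
# Only substitutions are modeled here; this is a simplification.
Edit = Tuple[int, int, int, str]


def build_collection(base: str, num_copies: int, edits: List[Edit]) -> List[str]:
    """Make num_copies copies of base, then apply each edit to a range of copies."""
    docs = [list(base) for _ in range(num_copies)]
    for first, last, pos, ch in edits:
        for d in range(first, last + 1):
            docs[d][pos] = ch
    return ["".join(d) for d in docs]


def list_documents(docs: List[str], pattern: str) -> List[int]:
    """Document listing by brute force: indices of the documents containing pattern."""
    return [i for i, doc in enumerate(docs) if pattern in doc]


def count_occurrences(text: str, pattern: str) -> int:
    """Count all (possibly overlapping) occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


# Four copies of a base string; one edit covers copies 1..2, another only copy 3.
docs = build_collection("abracadabra", 4, [(1, 2, 4, "x"), (3, 3, 0, "z")])
print(list_documents(docs, "cad"))
print(count_occurrences(docs[0], "abra"))
```

This baseline scans every document on each query, so its cost grows with the total collection size $N$. The point of the index described in the abstract is to answer the same queries in time that depends on $m$, $\log N$, and the number of reported documents instead.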