Fast Label Extraction in the CDAWG
July 25, 2017 Β· Declared Dead Β· π SPIRE
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Djamal Belazzougui, Fabio Cunial
arXiv ID
1707.08197
Category
cs.DS: Data Structures & Algorithms
Citations
27
Venue
SPIRE
Last Checked
3 months ago
Abstract
The compact directed acyclic word graph (CDAWG) of a string $T$ of length $n$ takes space proportional just to the number $e$ of right extensions of the maximal repeats of $T$, and it is thus an appealing index for highly repetitive datasets, like collections of genomes from similar species, in which $e$ grows significantly more slowly than $n$. We reduce from $O(m\log{\log{n}})$ to $O(m)$ the time needed to count the number of occurrences of a pattern of length $m$, using an existing data structure that takes an amount of space proportional to the size of the CDAWG. This implies a reduction from $O(m\log{\log{n}}+\mathtt{occ})$ to $O(m+\mathtt{occ})$ in the time needed to locate all the $\mathtt{occ}$ occurrences of the pattern. We also reduce from $O(k\log{\log{n}})$ to $O(k)$ the time needed to read the $k$ characters of the label of an edge of the suffix tree of $T$, and we reduce from $O(m\log{\log{n}})$ to $O(m)$ the time needed to compute the matching statistics between a query of length $m$ and $T$, using an existing representation of the suffix tree based on the CDAWG. All such improvements derive from extracting the label of a vertex or of an arc of the CDAWG using a straight-line program induced by the reversed CDAWG.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted