EPR-dictionaries: A practical and fast data structure for constant time searches in unidirectional and bidirectional FM-indices
August 08, 2016 · Declared Dead · 🏛 Annual International Conference on Research in Computational Molecular Biology
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Christopher Pockrandt, Marcel Ehrhardt, Knut Reinert
arXiv ID
1608.02413
Category
cs.DS: Data Structures & Algorithms
Citations
13
Venue
Annual International Conference on Research in Computational Molecular Biology
Last Checked
3 months ago
Abstract
We introduce a new, practical method for conducting an exact search in a uni- and bidirectional FM index in $O(1)$ time per step while using $O(\log σ* n) + o(\log σ* σ* n)$ bits of space. This is done by replacing the binary wavelet tree by a new data structure, the Enhanced Prefixsum Rank dictionary (EPR-dictionary). We implemented this method in the SeqAn C++ library and experimentally validated our theoretical results. In addition we compared our implementation with other freely available implementations of bidirectional indices and show that we are between $\approx 2.6-4.8$ times faster. This will have a large impact for many bioinformatics applications that rely on practical implementations of (2)FM indices e.g. for read mapping. To our knowledge this is the first implementation of a constant time method for a search step in 2FM indices.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
📜 Similar Papers
In the same crypt — Data Structures & Algorithms
📚
📚
The Cartographer
R.I.P.
👻
Ghosted
Route Planning in Transportation Networks
R.I.P.
👻
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
👻
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
👻
Ghosted
Graph Isomorphism in Quasipolynomial Time
📚
📚
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way — 👻 Ghosted
R.I.P.
👻
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
👻
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
👻
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
👻
Ghosted