RLZAP: Relative Lempel-Ziv with Adaptive Pointers

May 14, 2016 Β· Declared Dead Β· πŸ› SPIRE

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Anthony J. Cox, Andrea Farruggia, Travis Gagie, Simon J. Puglisi, Jouni SirΓ©n arXiv ID 1605.04421 Category cs.DS: Data Structures & Algorithms Citations 10 Venue SPIRE Last Checked 4 months ago
Abstract
Relative Lempel-Ziv (RLZ) is a popular algorithm for compressing databases of genomes from individuals of the same species when fast random access is desired. With Kuruppu et al.'s (SPIRE 2010) original implementation, a reference genome is selected and then the other genomes are greedily parsed into phrases exactly matching substrings of the reference. Deorowicz and Grabowski (Bioinformatics, 2011) pointed out that letting each phrase end with a mismatch character usually gives better compression because many of the differences between individuals' genomes are single-nucleotide substitutions. Ferrada et al. (SPIRE 2014) then pointed out that also using relative pointers and run-length compressing them usually gives even better compression. In this paper we generalize Ferrada et al.'s idea to handle well also short insertions, deletions and multi-character substitutions. We show experimentally that our generalization achieves better compression than Ferrada et al.'s implementation with comparable random-access times.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Data Structures & Algorithms

Died the same way β€” πŸ‘» Ghosted