Bit catastrophes for the Burrows-Wheeler Transform
April 16, 2024 Β· Declared Dead Β· π Theory of Computing Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Sara Giuliani, Shunsuke Inenaga, Zsuzsanna LiptΓ‘k, Giuseppe Romana, Marinella Sciortino, Cristian Urbina
arXiv ID
2404.10426
Category
cs.DS: Data Structures & Algorithms
Cross-listed
cs.DM
Citations
15
Venue
Theory of Computing Systems
Last Checked
3 months ago
Abstract
A bit catastrophe, loosely defined, is when a change in just one character of a string causes a significant change in the size of the compressed string. We study this phenomenon for the Burrows-Wheeler Transform (BWT), a string transform at the heart of several of the most popular compressors and aligners today. The parameter determining the size of the compressed data is the number of equal-letter runs of the BWT, commonly denoted $r$. We exhibit infinite families of strings in which insertion, deletion, resp. substitution of one character increases $r$ from constant to $Ξ(\log n)$, where $n$ is the length of the string. These strings can be interpreted both as examples for an increase by a multiplicative or an additive $Ξ(\log n)$-factor. As regards multiplicative factor, they attain the upper bound given by Akagi, Funakoshi, and Inenaga [Inf & Comput. 2023] of $O(\log n \log r)$, since here $r=O(1)$. We then give examples of strings in which insertion, deletion, resp. substitution of a character increases $r$ by a $Ξ(\sqrt{n})$ additive factor. These strings significantly improve the best known lower bound for an additive factor of $Ξ©(\log n)$ [Giuliani et al., SOFSEM 2021].
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted