Merging Sorted Lists of Similar Strings

August 19, 2022 · Declared Dead · 🏛 Annual Symposium on Combinatorial Pattern Matching

💀 CAUSE OF DEATH: 404 Not Found
Code link is broken/dead
Authors: Gene Myers
arXiv ID: 2208.09351
Category: cs.DS (Data Structures & Algorithms)
Citations: 0
Venue: Annual Symposium on Combinatorial Pattern Matching
Repository: https://github.com/thegenemyers/STRING.HEAP
Last Checked: 1 month ago
Abstract
Merging $T$ sorted, non-redundant lists containing $M$ elements into a single sorted, non-redundant result of size $N \ge M/T$ is a classic problem, typically solved in practice in $O(M \log T)$ time with a priority-queue data structure, the most basic of which is the simple *heap*. We revisit this problem in the situation where the list elements are *strings* and the lists contain many *identical or nearly identical elements*. By keeping simple auxiliary information with each heap node, we devise an $O(M \log T + S)$ worst-case method that performs no more character comparisons than the sum of the lengths of all the strings, $S$, and another $O(M \log(T/\bar e) + S)$ method that becomes progressively more efficient as a function of the fraction of equal elements $\bar e = M/N$ between input lists, reaching linear time when the lists are all identical. The methods perform favorably in practice versus an alternate formulation based on a trie.
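
For orientation, the classic heap-based baseline that the abstract starts from can be sketched as follows. This is only the standard $O(M \log T)$ priority-queue merge, not the paper's method: it keeps no auxiliary information per heap node, so every string comparison restarts from the first character. The function name `merge_unique` is a hypothetical choice for illustration.

```python
import heapq
from typing import List


def merge_unique(lists: List[List[str]]) -> List[str]:
    """Merge sorted, duplicate-free lists into one sorted, duplicate-free list.

    Baseline sketch only: each heap comparison is a full string comparison,
    which is the cost the paper's auxiliary information aims to reduce.
    """
    # Each heap entry is (current string, index of its list, position in that list).
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)

    out: List[str] = []
    while heap:
        s, i, pos = heapq.heappop(heap)
        # Emit a string only once, even if several lists contain it.
        if not out or out[-1] != s:
            out.append(s)
        # Advance within the list the popped string came from.
        if pos + 1 < len(lists[i]):
            heapq.heappush(heap, (lists[i][pos + 1], i, pos + 1))
    return out


if __name__ == "__main__":
    # T = 3 lists, M = 6 elements in total, N = 4 distinct results.
    print(merge_unique([["acgt", "acgu"], ["acgt", "ccgt"], ["acgu", "tttt"]]))
```

In the usage above, $M = 6$ and $N = 4$, so $\bar e = M/N = 1.5$; the closer the input lists are to identical, the larger $\bar e$ becomes.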