copMEM: Finding maximal exact matches via sampling both genomes
May 22, 2018 · Entered Twilight · Bioinform.
"Last commit was 6.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: CopMEM.src, MemoryFill.src, copmem.sln, demo.cmd, demo.sh, license, makefile, readme.txt
Authors
Szymon Grabowski, Wojciech Bieniecki
arXiv ID
1805.08816
Category
cs.DS: Data Structures & Algorithms
Cross-listed
q-bio.GN
Citations
15
Venue
Bioinform.
Repository
https://github.com/wbieniec/copmem
⭐ 1
Last Checked
1 month ago
Abstract
Genome-to-genome comparisons require designating anchor points, which are given by Maximal Exact Matches (MEMs) between their sequences. For large genomes this is a challenging problem, and the performance of existing solutions, even in parallel regimes, is not quite satisfactory. We present a new algorithm, copMEM, that sparsely samples both input genomes, with the sampling steps being coprime. Despite being a single-threaded implementation, copMEM computes all MEMs of minimum length 100 between the human and mouse genomes in less than 2 minutes, using less than 10 GB of RAM.
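The coprime-sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation (the repository is C++ and uses its own hashing scheme); the function name `find_mems` and the parameters `k`, `k1`, `k2` are hypothetical. It assumes seeds of length `k`, a reference sampled every `k1` positions and a query sampled every `k2` positions, with `gcd(k1, k2) = 1` and `k1 * k2 <= min_len - k + 1`. Under those assumptions the Chinese remainder theorem guarantees that every match of length at least `min_len` contains one seed sampled in both sequences at the same offset, so no MEM is missed.

```python
from collections import defaultdict
from math import gcd


def find_mems(ref, qry, min_len, k, k1, k2):
    """Return the set of (ref_pos, qry_pos, length) for all MEMs of length >= min_len."""
    if gcd(k1, k2) != 1:
        raise ValueError("sampling steps must be coprime")
    if k1 * k2 > min_len - k + 1:
        raise ValueError("need k1 * k2 <= min_len - k + 1 to hit every match")

    # Index every k1-th k-mer of the reference (a plain dict stands in for a hash table).
    index = defaultdict(list)
    for i in range(0, len(ref) - k + 1, k1):
        index[ref[i:i + k]].append(i)

    mems = set()
    # Probe the index with every k2-th k-mer of the query.
    for j in range(0, len(qry) - k + 1, k2):
        for i in index.get(qry[j:j + k], ()):
            # Extend the seed to the left as far as the characters agree.
            a, b = i, j
            while a > 0 and b > 0 and ref[a - 1] == qry[b - 1]:
                a -= 1
                b -= 1
            # Extend the seed to the right in the same way.
            e, f = i + k, j + k
            while e < len(ref) and f < len(qry) and ref[e] == qry[f]:
                e += 1
                f += 1
            if e - a >= min_len:
                mems.add((a, b, e - a))  # the set drops MEMs reached from several seeds
    return mems
```

For example, `find_mems("GGACGTACGTAACC", "TTACGTACGTAATT", min_len=8, k=3, k1=2, k2=3)` indexes only every second reference 3-mer and probes only every third query 3-mer, yet still reports the single shared stretch of length 10. Larger steps shrink the index and the number of probes, at the cost of a shorter usable seed length `k` for a given `min_len`.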
Similar Papers
In the same crypt: Data Structures & Algorithms
Relief-Based Feature Selection: Introduction and Review
Route Planning in Transportation Networks
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
Hierarchical Clustering: Objective Functions and Algorithms