Aligning coding sequences with frameshift extension penalties
October 27, 2016 · Entered Twilight · Algorithms for Molecular Biology
"Last commit was 9.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: README.md, examples, resources, src
Authors
Safa Jammali, Esaie Kuitche, Ayoub Rachati, François Bélanger, Michelle Scott, Aïda Ouangraoua
arXiv ID
1610.08809
Category
cs.DS: Data Structures & Algorithms
Cross-listed
q-bio.GN, q-bio.QM
Citations
1
Venue
Algorithms for Molecular Biology
Repository
https://github.com/UdeS-CoBIUS/FsePSA
★ 2
Last Checked
2 months ago
Abstract
Frameshift translation is an important phenomenon that contributes to the appearance of novel Coding DNA Sequences (CDS) and functions in gene evolution by allowing alternative amino acid translations of genes' coding regions. Frameshift translations can be identified by aligning two CDS, from the same gene or from homologous genes, while accounting for their codon structure. Two main classes of algorithms have been proposed to solve the problem of aligning CDS: back-translation of an amino acid sequence alignment, or simultaneous accounting for the nucleotide and amino acid levels. The former cannot account for frameshift translations, and up to now the latter has accounted only for frameshift translation initiation, not for the length of the translation disruption caused by a frameshift. Here, we introduce a new scoring scheme with an algorithm for the pairwise alignment of CDS that accounts for frameshift translation initiation and length, while simultaneously accounting for nucleotide and amino acid sequences. We compare the method to other CDS alignment methods by applying it to pairs of CDS from homologous human, mouse and cow genes of ten mammalian gene families from the Ensembl-Compara database. The results show that our method is particularly robust to parameter changes compared to existing methods. It also appears to be a good compromise, performing well both in the presence and absence of frameshift translations between the CDS. An implementation of the method is available at https://github.com/UdeS-CoBIUS/FsePSA.
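To make the idea of penalizing both frameshift initiation and frameshift length concrete, here is a minimal Python sketch. It is not the FsePSA scoring scheme or algorithm: the function names, the penalty values `fs_open` and `fs_ext`, and the choice to measure frameshift length in aligned nucleotide columns are all assumptions made for illustration. It takes an already computed pairwise alignment (gaps written as `-`), finds maximal runs of columns where the two sequences sit at different codon phases, and charges each run an opening cost plus a per-column extension cost.

```python
def frame_phases(aligned):
    """Codon phase (0, 1 or 2) of each nucleotide; None for gap columns."""
    phases, pos = [], 0
    for ch in aligned:
        if ch == "-":
            phases.append(None)
        else:
            phases.append(pos % 3)
            pos += 1
    return phases


def frameshift_runs(aln_a, aln_b):
    """Lengths of maximal runs of columns aligned in different codon phases.

    Gap columns are skipped: they neither extend nor close a run.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    runs, current = [], 0
    for pa, pb in zip(frame_phases(aln_a), frame_phases(aln_b)):
        if pa is None or pb is None:
            continue
        if pa != pb:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def frameshift_cost(aln_a, aln_b, fs_open=10, fs_ext=1):
    """Hypothetical cost: one opening charge per run plus a per-column charge."""
    return sum(fs_open + fs_ext * n for n in frameshift_runs(aln_a, aln_b))


# A 1-nt deletion followed by a compensating 1-nt insertion:
# five columns are aligned out of phase between the two indels.
print(frameshift_runs("ATGCCCAAAGG-GTTT", "ATGCC-AAAGGAGTTT"))  # [5]
print(frameshift_cost("ATGCCCAAAGG-GTTT", "ATGCC-AAAGGAGTTT"))  # 15

# A whole-codon deletion keeps both sequences in phase: no frameshift cost.
print(frameshift_cost("ATGCCCAAA", "ATG---AAA"))  # 0
```

With a scheme that charges only for initiation, the first alignment would cost the same whether the out-of-phase stretch covered five columns or five hundred. The extension term is what makes the cost grow with the length of the disrupted translation.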
Similar Papers
In the same crypt: Data Structures & Algorithms
Relief-Based Feature Selection: Introduction and Review (R.I.P., Ghosted)
Route Planning in Transportation Networks (R.I.P., Ghosted)
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration (R.I.P., Ghosted)
Hierarchical Clustering: Objective Functions and Algorithms (R.I.P., Ghosted)