Big Data Analytics in Bioinformatics: A Machine Learning Perspective
June 15, 2015 · Declared Dead · arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Hirak Kashyap, Hasin Afzal Ahmed, Nazrul Hoque, Swarup Roy, Dhruba Kumar Bhattacharyya
arXiv ID
1506.05101
Category
cs.CE: Computational Engineering
Cross-listed
cs.LG
Citations
93
Venue
arXiv.org
Last Checked
1 month ago
Abstract
Bioinformatics research is characterized by voluminous and incremental datasets and complex data analytics methods. The machine learning methods used in bioinformatics are iterative and parallel. These methods can be scaled to handle big data using distributed and parallel computing technologies. Big data tools usually perform computation in batch mode and are not optimized for iterative processing or for high data dependency among operations. In recent years, parallel, incremental, and multi-view machine learning algorithms have been proposed. Similarly, graph-based architectures and in-memory big data tools have been developed to minimize I/O cost and optimize iterative processing. However, standard big data architectures and tools are lacking for many important bioinformatics problems, such as fast construction of co-expression and regulatory networks and salient module identification, detection of complexes over growing protein-protein interaction data, fast analysis of massive DNA, RNA, and protein sequence data, and fast querying on incremental and heterogeneous disease networks. This paper addresses the issues and challenges posed by several big data problems in bioinformatics, and gives an overview of the state of the art and future research opportunities.
Similar Papers
In the same crypt: Computational Engineering
R.I.P. 👻 Ghosted
R.I.P. 👻 Ghosted
A Probabilistic Graphical Model Foundation for Enabling Predictive Digital Twins at Scale
R.I.P. 👻 Ghosted
Temporal Attention augmented Bilinear Network for Financial Time-Series Data Analysis
R.I.P. 👻 Ghosted
Linked Component Analysis from Matrices to High Order Tensors: Applications to Biomedical Data
R.I.P. 👻 Ghosted
Deep Dynamical Modeling and Control of Unsteady Fluid Flows
R.I.P. 👻 Ghosted
Design and Optimization of Conforming Lattice Structures
Died the same way: 👻 Ghosted
R.I.P. 👻 Ghosted
Language Models are Few-Shot Learners
R.I.P. 👻 Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P. 👻 Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P. 👻 Ghosted