topFiberM: Scalable and Efficient Boolean Matrix Factorization
March 06, 2019 · Entered Twilight · arXiv.org
"Last commit was 6.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: GreCond.R, README.md, data, test_GreCond.R, test_topFiberM_exp.R, topFiberM_exp.R
Authors
Abdelmoneim Amer Desouki, Michael Röder, Axel-Cyrille Ngonga Ngomo
arXiv ID
1903.10326
Category
cs.DS: Data Structures & Algorithms
Citations
1
Venue
arXiv.org
Repository
https://github.com/dice-group/BMF
★ 3
Last Checked
2 months ago
Abstract
Matrix factorization has many applications, such as clustering. When the matrix is Boolean, it is preferable to have Boolean factors too. This avoids the effort of quantizing the reconstructed data back to Boolean values, which is usually done using arbitrary thresholds. Here we introduce topFiberM, a Boolean matrix factorization algorithm. topFiberM greedily chooses the fibers (rows or columns) that represent the entire matrix. Fibers are extended to rectangles according to a threshold on precision. The search for these "top fibers" can continue beyond the required rank, up to a limit set by an optional parameter. A factor with a better gain replaces the factor with the minimum gain in the "top fibers". We compared topFiberM to state-of-the-art methods; it achieved better quality on the datasets commonly used in the literature. We also applied our algorithm to linked data to show its scalability. topFiberM was on average 128 times faster than the well-known Asso method when applied to a set of matrices representing a real multigraph, even though Asso is implemented in C and topFiberM is implemented in R, which is generally slower than C. topFiberM is publicly available on GitHub (https://github.com/dice-group/BMF).
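To make the idea of extending a fiber to a rectangle more concrete, here is a minimal Python sketch. It is not the authors' R implementation and it leaves out most of what the abstract describes: it only uses row fibers, it stops at the requested rank, and it never replaces a low-gain factor. The function names (`boolean_product`, `greedy_bmf`), the parameter name `tP`, and the way precision is computed (the fraction of the fiber's columns in which a row has a 1) are all assumptions made for illustration.

```python
def boolean_product(A, B):
    """Boolean matrix product: C[i][j] = OR over k of (A[i][k] AND B[k][j])."""
    inner = len(B)
    return [[int(any(A[i][k] and B[k][j] for k in range(inner)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def greedy_bmf(X, rank, tP=0.8):
    """Simplified greedy factorization of a 0/1 matrix X into A (n x rank)
    and B (rank x m). Illustrative only; see the assumptions in the text."""
    n, m = len(X), len(X[0])
    residual = [row[:] for row in X]          # ones not yet covered
    A = [[0] * rank for _ in range(n)]
    B = [[0] * m for _ in range(rank)]
    for k in range(rank):
        # Pick the row fiber that still has the most uncovered ones.
        best = max(range(n), key=lambda i: sum(residual[i]))
        cols = [j for j in range(m) if residual[best][j]]
        if not cols:
            break                              # everything is covered
        # Extend the fiber to a rectangle: keep every row whose precision
        # on the fiber's columns reaches the threshold tP (assumed definition).
        rows = [i for i in range(n)
                if sum(X[i][j] for j in cols) / len(cols) >= tP]
        for i in rows:
            A[i][k] = 1
        for j in cols:
            B[k][j] = 1
        for i in rows:
            for j in cols:
                residual[i][j] = 0
    return A, B


# Toy matrix with two disjoint all-ones blocks.
X = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
A, B = greedy_bmf(X, rank=2, tP=1.0)
R = boolean_product(A, B)
# Number of cells where the Boolean reconstruction differs from X.
mismatches = sum(R[i][j] != X[i][j] for i in range(len(X)) for j in range(len(X[0])))
```

Because both factors are Boolean, the reconstruction `R` is already a 0/1 matrix and needs no thresholding step. On this toy input the two rectangles recover `X` exactly; on noisy data a lower `tP` would admit rows that only partly match the fiber, trading precision for coverage.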
Similar Papers
In the same crypt: Data Structures & Algorithms
R.I.P.
👻
Ghosted
Relief-Based Feature Selection: Introduction and Review
R.I.P.
👻
Ghosted
Route Planning in Transportation Networks
R.I.P.
👻
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
👻
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms