Accelerating Biclique Counting on GPU
March 12, 2024 Β· Declared Dead Β· π IEEE International Conference on Data Engineering
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Linshan Qiu, Zhonggen Li, Xiangyu Ke, Lu Chen, Yunjun Gao
arXiv ID
2403.07858
Category
cs.DC: Distributed Computing
Citations
2
Venue
IEEE International Conference on Data Engineering
Last Checked
4 months ago
Abstract
Counting (p,q)-bicliques in bipartite graphs poses a foundational challenge with broad applications, from densest subgraph discovery in algorithmic research to personalized content recommendation in practical scenarios. Despite its significance, current leading (p,q)-biclique counting algorithms fall short, particularly when faced with larger graph sizes and clique scales. Fortunately, the problem's inherent structure, allowing for the independent counting of each biclique starting from every vertex, combined with a substantial set intersections, makes it highly amenable to parallelization. Recent successes in GPU-accelerated algorithms across various domains motivate our exploration into harnessing the parallelism power of GPUs to efficiently address the (p,q)-biclique counting challenge. We introduce GBC (GPU-based Biclique Counting), a novel approach designed to enable efficient and scalable (p,q)-biclique counting on GPUs. To address major bottleneck arising from redundant comparisons in set intersections (occupying an average of 90% of the runtime), we introduce a novel data structure that hashes adjacency lists into truncated bitmaps to enable efficient set intersection on GPUs via bit-wise AND operations. Our innovative hybrid DFS-BFS exploration strategy further enhances thread utilization and effectively manages memory constraints. A composite load balancing strategy, integrating pre-runtime and runtime workload allocation, ensures equitable distribution among threads. Additionally, we employ vertex reordering and graph partitioning strategies for improved compactness and scalability. Experimental evaluations on eight real-life and two synthetic datasets demonstrate that GBC outperforms state-of-the-art algorithms by a substantial margin. In particular, GBC achieves an average speedup of 497.8x, with the largest instance achieving a remarkable 1217.7x speedup when p = q = 8.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Distributed Computing
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Reproducing GW150914: the first observation of gravitational waves from a binary black hole merger
R.I.P.
π»
Ghosted
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
R.I.P.
π»
Ghosted
Adaptive Federated Learning in Resource Constrained Edge Computing Systems
R.I.P.
π»
Ghosted
Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing
R.I.P.
π»
Ghosted
iFogSim: A Toolkit for Modeling and Simulation of Resource Management Techniques in Internet of Things, Edge and Fog Computing Environments
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted