R.I.P.
π»
Ghosted
Unified schemes for directive-based GPU offloading
November 28, 2024 Β· Declared Dead Β· π IEEE Access
Authors
Yohei Miki, Toshihiro Hanawa
arXiv ID
2411.18889
Category
cs.DC: Distributed Computing
Cross-listed
astro-ph.IM,
cs.PF,
cs.PL
Citations
0
Venue
IEEE Access
Repository
https://github.com/ymiki-repo/solomon}
Last Checked
2 months ago
Abstract
GPU is the dominant accelerator device due to its high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes originally developed for multicore CPUs. Although OpenACC and OpenMP target provide similar features, both methods have pros and cons. OpenACC has better functions and an abundance of documents, but it is virtually for NVIDIA GPUs. OpenMP target supports NVIDIA/AMD/Intel GPUs but has fewer functions than OpenACC. Here, we have developed a header-only library, Solomon (Simple Off-LOading Macros Orchestrating multiple Notations), to unify the interface for GPU offloading with the support of both OpenACC and OpenMP target. Solomon provides three types of notations to reduce users' implementation and learning costs: intuitive notation for beginners and OpenACC/OpenMP-like notations for experienced developers. This manuscript denotes Solomon's implementation and usage and demonstrates the GPU-offloading in $N$-body simulation and the three-dimensional diffusion equation. The library and sample codes are provided as open-source software and publicly and freely available at \url{https://github.com/ymiki-repo/solomon}.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Distributed Computing
R.I.P.
π»
Ghosted
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
R.I.P.
π»
Ghosted
Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains
R.I.P.
π»
Ghosted
Reproducing GW150914: the first observation of gravitational waves from a binary black hole merger
R.I.P.
π»
Ghosted
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
R.I.P.
π»
Ghosted
Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems
Died the same way β π 404 Not Found
R.I.P.
π
404 Not Found
Deep High-Resolution Representation Learning for Visual Recognition
R.I.P.
π
404 Not Found
HuggingFace's Transformers: State-of-the-art Natural Language Processing
R.I.P.
π
404 Not Found
CCNet: Criss-Cross Attention for Semantic Segmentation
R.I.P.
π
404 Not Found