Accelerating Fortran Codes: A Method for Integrating Coarray Fortran with CUDA Fortran and OpenMP

September 03, 2024 · Declared Dead · 🏛 J. Parallel Distributed Comput.

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors James McKevitt, Eduard I. Vorobyov, Igor Kulikov arXiv ID 2409.02294 Category astro-ph.IM Cross-listed astro-ph.SR, cs.DC, cs.PL Citations 4 Venue J. Parallel Distributed Comput. Last Checked 1 month ago

Abstract

Fortran's prominence in scientific computing requires strategies to ensure both that legacy codes are efficient on high-performance computing systems, and that the language remains attractive for the development of new high-performance codes. Coarray Fortran (CAF), part of the Fortran 2008 standard introduced for parallel programming, facilitates distributed memory parallelism with a syntax familiar to Fortran programmers, simplifying the transition from single-processor to multi-processor coding. This research focuses on innovating and refining a parallel programming methodology that fuses the strengths of Intel Coarray Fortran, Nvidia CUDA Fortran, and OpenMP for distributed memory parallelism, high-speed GPU acceleration and shared memory parallelism respectively. We consider the management of pageable and pinned memory, CPU-GPU affinity in NUMA multiprocessors, and robust compiler interfacing with speed optimisation. We demonstrate our method through its application to a parallelised Poisson solver and compare the methodology, implementation, and scaling performance to that of the Message Passing Interface (MPI), finding CAF offers similar speeds with easier implementation. For new codes, this approach offers a faster route to optimised parallel computing. For legacy codes, it eases the transition to parallel computing, allowing their transformation into scalable, high-performance computing applications without the need for extensive re-design or additional syntax.