Published September 2024
Journal Article

Method for scalable and performant GPU-accelerated simulation of multiphase compressible flow

Abstract

Multiphase compressible flows are often characterized by a broad range of space and time scales, entailing large grids and small time steps. Simulations of these flows on CPU-based clusters can thus take several wall-clock days. Offloading the compute kernels to GPUs appears attractive but is memory-bound for many finite-volume and -difference methods, damping speedups. Even when realized, GPU-based kernels lead to more intrusive communication and I/O times owing to lower computation costs. We present a strategy for GPU acceleration of multiphase compressible flow solvers that addresses these challenges and obtains large speedups at scale. We use OpenACC for directive-based offloading of all compute kernels while maintaining low-level control when needed. An established Fortran preprocessor and metaprogramming tool, Fypp, enables otherwise hidden compile-time optimizations. This strategy exposes compile-time optimizations and high memory reuse while retaining readable, maintainable, and compact code. Remote direct memory access realized via CUDA-aware MPI and GPUDirect reduces halo-exchange communication time. We implement this approach in the open-source solver MFC [1]. Metaprogramming results in an 8-times speedup of the most expensive kernels compared to a statically compiled program, reaching 46% of peak FLOPs on modern NVIDIA GPUs and high arithmetic intensity (about 10 FLOPs/byte). In representative simulations, a single NVIDIA A100 GPU is 7-times faster compared to an Intel Xeon Cascade Lake (6248) CPU die, or about 300-times faster compared to a single such CPU core. At the same time, near-ideal (97%) weak scaling is observed for at least 13824 GPUs on OLCF Summit. A strong scaling efficiency of 84% is retained for an 8-times increase in GPU count. Collective I/O, implemented via MPI3, helps ensure the negligible contribution of data transfers (<1% of the wall time for a typical, large simulation). Large many-GPU simulations of compressible (solid-)liquid-gas flows demonstrate the practical utility of this strategy.
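
The weak- and strong-scaling percentages quoted above can be related to wall-clock timings with the conventional efficiency definitions. The following is a minimal sketch under stated assumptions: the function names, the GPU counts of 8 and 64, and all timings are hypothetical, chosen only so that the arithmetic reproduces the 84% and 97% figures. They are not measurements from the paper, and the paper may define efficiency differently.

```python
def strong_scaling_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Fixed total problem size: ideal wall time falls as 1/N."""
    speedup = t_base / t_scaled
    return speedup / (n_scaled / n_base)


def weak_scaling_efficiency(t_base, t_scaled):
    """Fixed problem size per GPU: ideal wall time stays constant."""
    return t_base / t_scaled


# Hypothetical timings (seconds), not data from the paper.
# An 8-times increase in GPU count at 84% efficiency corresponds
# to an effective speedup of 0.84 * 8 = 6.72.
strong = strong_scaling_efficiency(
    t_base=100.0, n_base=8, t_scaled=100.0 / 6.72, n_scaled=64
)
weak = weak_scaling_efficiency(t_base=97.0, t_scaled=100.0)

print(f"strong: {strong:.2f}, weak: {weak:.2f}")  # strong: 0.84, weak: 0.97
```

Read this way, 84% strong-scaling efficiency over an 8-times increase in GPU count means the run finishes about 6.7 times sooner rather than 8 times sooner, and 97% weak-scaling efficiency means the wall time grows by roughly 3% as the problem and GPU count grow together.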

Copyright and License

© 2024 Elsevier.

Acknowledgement

We acknowledge useful discussions of this work from Brent Leback and Mat Colgrove (NVIDIA), Stéphan Ethier (Princeton), Nicholson Koukpaizan (Oak Ridge National Lab), Pedro Costa (TU Delft), and Luca Brandt (KTH, Sweden). SHB acknowledges the support of this work via the US Office of Naval Research under grant number N00014-22-1-2519 (PM Julie Young), hardware gifts from the NVIDIA Corporation, and use of OLCF Summit and Wombat under allocation CFD154. TC acknowledges support via the US Office of Naval Research under grant number N00014-22-1-2518 (PM Julie Young). This work used Bridges2 at the Pittsburgh Supercomputing Center through allocation PHY210084 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296.

Contributions

Anand Radhakrishnan: Writing – review & editing, Writing – original draft, Software, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Henry Le Berre: Writing – original draft, Software, Methodology, Conceptualization. Benjamin Wilfong: Software, Methodology. Jean-Sebastien Spratt: Software. Mauro Rodriguez: Software. Tim Colonius: Supervision, Funding acquisition. Spencer H. Bryngelson: Writing – review & editing, Writing – original draft, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization.

Code Availability

All code is available at https://github.com/MFlowCode/MFC.

Conflict of Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Spencer Bryngelson reports financial support was provided by Office of Naval Research.

Additional details

Created: May 29, 2024
Modified: May 29, 2024