Toward a GPU-enabled billionaire SVD in pyLOM

Abstract

We develop and implement pyLOM, an accelerated, high-performance, open-source computing environment for model order reduction in fluid dynamics. It contains singular value decomposition-based algorithms implemented for massively parallel GPU architectures. The library is profiled in detail on the MareNostrum V supercomputer. The largest case, a matrix of a billion nodes by a thousand snapshots, was computed in under 20 s with 100 GPUs. A hybrid CPU-GPU parallel randomized QR factorization was found to be able to handle such large matrices. The largest speedup factor, 83, was found for the QR factorization, while the matrix–matrix multiplication showed a speedup factor of about 2. Additionally, two application examples are provided, the flow around a cylinder and the Windsor body, whose proper orthogonal decomposition (POD) is computed in under 3 s with 100 GPUs. This showcases the efficiency of GPUs, resulting in a 97% reduction in energy to solution and a reduction of 0.11 kg of CO2 emissions. The scalability and efficiency achieved suggest that this framework can play a key role in reducing the energy demands and environmental impact of large-scale data analysis and model order reduction across a wide range of applications.
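
To illustrate why a QR factorization is central to an SVD of a very tall snapshot matrix, the following is a minimal NumPy sketch of a block-wise (tall-skinny QR style) SVD. It is not pyLOM's API or implementation: the function name `tsqr_svd` is hypothetical, the row blocks stand in for the partitions that would live on separate ranks or GPUs, and the scheme shown is deterministic rather than the randomized, hybrid CPU-GPU variant described in the abstract. It assumes every block has at least as many rows as the matrix has columns.

```python
import numpy as np

def tsqr_svd(blocks):
    """Thin SVD of a tall-skinny matrix supplied as a list of row blocks.

    Hypothetical helper for illustration only. Each block is assumed to
    have shape (m_i, n) with m_i >= n.
    """
    n = blocks[0].shape[1]
    # Step 1: independent QR of each row block (the "local" work per partition).
    local = [np.linalg.qr(b) for b in blocks]
    # Step 2: stack the small n x n R factors and factorize the stack again.
    r_stack = np.vstack([r for _, r in local])
    q2, r = np.linalg.qr(r_stack)
    # Step 3: SVD of the small n x n matrix R; cost is independent of the row count.
    ur, s, vt = np.linalg.svd(r)
    # Step 4: rebuild each partition's slice of the left singular vectors.
    u_blocks = [
        q @ (q2[i * n:(i + 1) * n] @ ur)
        for i, (q, _) in enumerate(local)
    ]
    return u_blocks, s, vt

# Usage: a small random "snapshot matrix" split into 4 row blocks.
rng = np.random.default_rng(0)
a = rng.standard_normal((400, 8))
u_blocks, s, vt = tsqr_svd(np.array_split(a, 4, axis=0))
u = np.vstack(u_blocks)
```

Only the small R factors need to be gathered across partitions, so the communication volume depends on the number of snapshots (columns) and the number of partitions, not on the number of mesh nodes (rows).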