• A geometric multigrid preconditioner is proposed for a Bi-CGSTAB solver applied in the ISPH method for multi-GPU environments.
• The first restriction operation is conducted from the free-moving particles to a background cell, followed by standard coarsening restrictions.
• We propose several multi-GPU programming techniques, including communication-computation overlap and communication hiding.
• The computational efficiency was evaluated on 3 supercomputer clusters, yielding speedups of about 2.3 times and weak scalability around 0.85.
• The method’s potential was verified by simulating the Fukushima Daiichi nuclear power plant tsunami with 97 million particles in 16 GPUs.

This paper presents the implementation of a multigrid-preconditioned asymmetric solver within the context of the Incompressible Smoothed Particle Hydrodynamics (ISPH) method. The restriction operator maps particles to the background grid used for neighbor searching, followed by standard grid coarsening. Key contributions include adapting the multigrid preconditioner to multi-GPU cluster environments, incorporating communication-computation overlap and communication-hiding techniques, and adapting a dynamic load balancer to respect the coarse-grid hierarchy for efficient domain decomposition using the slice-grid technique. The computational efficiency of the proposed method is evaluated through dam-break simulations on three supercomputers (JAMSTEC’s Earth Simulator, Tongji University’s supercomputer, and the University of Tokyo’s Miyabi), demonstrating weak-scaling efficiencies between 0.82 and 0.89. The multigrid preconditioner substantially improved solver performance, reducing the average number of iterations for the pressure solve from over 300 to nearly 60 across problem sizes ranging from 5 to 320 million particles, resulting in an overall speedup of approximately 2.3 times.
Furthermore, we demonstrate the method’s capability for real-world engineering applications through the simulation of the 2011 tsunami inundation at the Fukushima Daiichi Nuclear Power Plant, involving 97 million particles on 32 GPUs. While the preconditioner proves highly effective in accelerating solver convergence, the study concludes by identifying limitations associated with coarse-grid constraints on load balancing and communication overhead, providing clear directions for future research.
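The two-stage restriction described in the abstract (particles to background cells, then standard grid coarsening) can be illustrated with a minimal 2D sketch. This is not the paper's implementation: the abstract does not state the restriction weights, so plain averaging is assumed here. The function names, the uniform cell size, the non-negative coordinates, and the treatment of empty cells as zero are all hypothetical choices made for illustration.

```python
def restrict_particles_to_cells(positions, values, cell_size, nx, ny):
    """Stage 1 (assumed form): average particle values into background cells.

    Assumes non-negative coordinates and a uniform cell size; empty cells get 0.0.
    """
    sums = [[0.0] * nx for _ in range(ny)]
    counts = [[0] * nx for _ in range(ny)]
    for (x, y), v in zip(positions, values):
        # Locate the background cell containing the particle (clamped to the grid).
        i = min(int(x / cell_size), nx - 1)
        j = min(int(y / cell_size), ny - 1)
        sums[j][i] += v
        counts[j][i] += 1
    return [
        [sums[j][i] / counts[j][i] if counts[j][i] else 0.0 for i in range(nx)]
        for j in range(ny)
    ]


def coarsen(grid):
    """Stage 2: standard 2x2 averaging; assumes even grid dimensions."""
    ny, nx = len(grid), len(grid[0])
    return [
        [
            0.25 * (grid[2 * j][2 * i] + grid[2 * j][2 * i + 1]
                    + grid[2 * j + 1][2 * i] + grid[2 * j + 1][2 * i + 1])
            for i in range(nx // 2)
        ]
        for j in range(ny // 2)
    ]


# Hypothetical data: four particles on a 4x4 background grid of unit cells.
positions = [(0.1, 0.1), (0.2, 0.3), (1.5, 0.5), (3.9, 3.9)]
values = [1.0, 3.0, 4.0, 8.0]
fine = restrict_particles_to_cells(positions, values, 1.0, 4, 4)
coarse = coarsen(fine)
```

Reusing the neighbor-search grid as the finest multigrid level, as the abstract describes, means the particle-to-cell lookup is already available; only the first stage touches particle data, and every coarser level operates on regular grids.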
Published in: Computer Methods in Applied Mechanics and Engineering
Volume 453, pp. 118842-118842