DLR has continued to improve the performance and scalability of the Computational Fluid Dynamics (CFD) software CODA, the FlowSimulator framework and the sparse linear systems solver Spliss. This includes an evaluation of CODA’s improved scalability, of the newly introduced mixed-precision mode in Spliss, and of the newly developed hierarchical mesh partition method in FlowSimulator. In addition, CODA’s containerised delivery was studied, and CODA was ported and tested on various upcoming Central Processing Unit (CPU) and Graphics Processing Unit (GPU) architectures. CERFACS has worked on expanding the existing GPU port of AVBP in terms of use-case coverage, supported architectures (with a strong focus on Advanced Micro Devices (AMD) GPUs) and general optimisation of the code structure to make it more efficient when offloaded to GPUs. RWTH continued to improve the performance and parallel efficiency of large-scale multiphysics simulations with the code m-AIA. A large-scale use case has been executed with high efficiency on 4096 compute nodes, demonstrating the exascale readiness of m-AIA on a CPU-based High-Performance Computing (HPC) system. Porting efforts to adapt m-AIA to GPU/Accelerated Processing Unit (APU) architectures are advancing at a high pace. Benchmarking on four EuroHPC systems has been carried out. BSC has focused on the GPU offloading of Alya using directive-based programming with OpenACC to minimise code changes while maintaining a unified codebase for both CPU and GPU targets. A first version of the code that can run incompressible Navier-Stokes problems fully on the GPU was obtained. The GPU performance was analysed and improved. In Task 3.4, we pursued further work on the integration of Alya with the malleability framework Dynamic Malleability Runtime (DMR) to enable physics simulations that can resize at runtime to operate inside a desired efficiency range.
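The idea of resizing a simulation to stay inside a desired efficiency range can be illustrated with a small decision rule. The following is a minimal sketch only: it does not use or reproduce the DMR interface, and the function name `resize_decision`, its parameters, and the doubling/halving policy are all hypothetical choices made for illustration.

```python
def resize_decision(efficiency, low, high, nodes, min_nodes, max_nodes, factor=2):
    """Return the node count a malleable job might request next.

    Hypothetical policy (not the DMR API): shrink when the measured
    parallel efficiency falls below `low`, grow when it exceeds `high`,
    and keep the current size otherwise.
    """
    if not 0.0 < low < high:
        raise ValueError("expected 0 < low < high")
    if efficiency < low and nodes > min_nodes:
        # Too many resources for the current workload: release some.
        return max(min_nodes, nodes // factor)
    if efficiency > high and nodes < max_nodes:
        # Resources are well used: the job can afford to grow.
        return min(max_nodes, nodes * factor)
    # Inside the target range, or already at a size limit.
    return nodes


if __name__ == "__main__":
    # Illustrative values only, not measurements from the project.
    print(resize_decision(0.50, 0.70, 0.90, nodes=64, min_nodes=8, max_nodes=256))
    print(resize_decision(0.95, 0.70, 0.90, nodes=64, min_nodes=8, max_nodes=256))
```

A real runtime would additionally have to redistribute the mesh and solver state after each resize, which this sketch leaves out.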
CINECA and URMLS have completed the rewriting of the FLEW code as part of the STREAmS-2 code. STREAmS-2 is based on an object-oriented architecture with support for different computational backends. The code for the different computational backends is generated through an in-house portability library that has been extended to integrate the new code features. The code has been benchmarked on different HPC systems, with a special focus on the Leonardo and LUMI clusters. An initial pipeline for Continuous Benchmarking was also implemented, along with several features that improve the workflow from an exascale perspective. Work on Neko focused on improving the compressible solver and enabling efficient GPU-to-GPU communication using the NCCL library. Strong scaling tests on AMD and NVIDIA GPUs showed good parallel efficiency. Neko also benefited from vectorisation optimisations and memory access tuning, which demonstrated strong performance potential on architectures with high-bandwidth memory. During the period, the UL team extended the Further Application FA-1 case to distributed memory architectures by porting L2G and Raysect using the OpenMPI library. L2G uses hybrid parallelisation with OpenMPI and OpenMP, and Raysect employs OpenMPI and Python multiprocessing. Strong scaling benchmarks were performed for all three codes (focusing on ITER and WEST reactor scenarios). Some scalability is observed, but performance is not yet optimal, and future work will focus on optimisation. Preliminary results on co-design indicate that High Bandwidth Memory (HBM) can significantly benefit certain codes within EXCELLERAT P2, particularly when combined with Double Data Rate (DDR) memory on systems like Rhea. Ongoing and future work focuses on code classification via Roofline analysis, with the goal of enabling targeted optimisations based on performance profiles. Task 3.3 focused on developing a unified testing platform for validation, deployment, and benchmarking.
Key efforts have included integrating tools from the CASTIEL 2 project for the Alya and AVBP codes and creating an automated testing pipeline for the previously unsupported STREAmS application.
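Two of the metrics referred to above, strong-scaling parallel efficiency and Roofline-based classification, can be made concrete with a short sketch. It uses the standard textbook definitions (efficiency as speedup divided by the increase in resources, and attainable performance as the minimum of peak compute and bandwidth times arithmetic intensity). The function names and all numbers are illustrative assumptions, not values or tooling from the project.

```python
def parallel_efficiency(base_nodes, base_time, nodes, time):
    """Strong-scaling efficiency relative to a baseline run (fixed problem size)."""
    speedup = base_time / time
    return speedup / (nodes / base_nodes)


def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: min(peak, bandwidth * arithmetic intensity).

    `intensity` is in FLOP per byte, `bandwidth_gbs` in GB/s.
    """
    return min(peak_gflops, intensity * bandwidth_gbs)


def classify(intensity, peak_gflops, bandwidth_gbs):
    """Label a kernel by which side of the Roofline ridge point it falls on."""
    ridge = peak_gflops / bandwidth_gbs  # FLOP/byte where the two roofs meet
    return "memory-bound" if intensity < ridge else "compute-bound"


if __name__ == "__main__":
    # Hypothetical runs: 100 s on 1 node, 30 s on 4 nodes.
    print(f"efficiency: {parallel_efficiency(1, 100.0, 4, 30.0):.2f}")

    # Hypothetical machine with 1000 GFLOP/s peak and two memory types.
    for label, bandwidth in (("DDR", 200.0), ("HBM", 1000.0)):
        bound = attainable_gflops(0.5, 1000.0, bandwidth)
        kind = classify(0.5, 1000.0, bandwidth)
        print(f"{label}: {bound:.0f} GFLOP/s attainable, {kind}")
```

For a kernel with low arithmetic intensity, the attainable performance scales directly with memory bandwidth, which is why memory-bound codes are the ones expected to gain most from HBM.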