PIConGPU, Particle In Cell on GPUs, is an open source simulation framework for plasma and laser-plasma physics used to develop advanced particle accelerators for radiation therapy of cancer, high energy physics, and photon science. While PIConGPU has been optimized for at least 5 years to run well on NVIDIA GPU-based clusters, the development team has done limited exploration of potential scalability bottlenecks using recently updated and new tools, including NVIDIA's NVProf tool and the brand-new NVIDIA NSight Suite (Systems and Compute) tools. PIConGPU is a highly optimized application that runs production jobs at scale on the Oak Ridge Leadership Computing Facility's (OLCF) Summit supercomputer (using the full machine at 4600 nodes at 98% of GPU utilization on all ~28000 NVIDIA Volta GPUs). PIConGPU has been selected as one of the eight applications for OLCF's coveted Center for Accelerated Application Readiness (CAAR) program aimed at the facility's Frontier supercomputer (OLCF's first exascale system, to launch in 2021), to partner with our vendors (primary vendors: AMD and Cray/HPE), ensuring that Frontier will be able to perform large-scale science when it opens to users in 2022. To this effect, performance engineers on the PIConGPU team wanted to dive deep into the application to understand, at the finest granularity, which portions of the code could be further optimized to exploit the hardware on Summit at its maximum potential, and also to elucidate which key kernels should be tracked and optimized for the CAAR effort to port this code to Frontier. Any bottlenecks that are observed via performance profiling on Summit are likely to also impact scalability on the Frontier-dev system and the Frontier Early Access (EA) system. Additionally, the engineers wanted to take a closer look at the newest NVIDIA profiling tools, which allows us to identify the most useful features of these tools, compare them to new performance analysis tool releases from AMD and Cray, and provide feedback to our vendor partners on which features are most important and mission critical for CAAR efforts. The primary goal of this report is to evaluate PIConGPU's most time-intensive kernels using NVProf and the NSight Suite. Three kernels, Current Deposition (also known as Compute Current), Particle Push (Move and Mark), and Shift Particles, are known to be some of the most time-consuming kernels in PIConGPU. The Current Deposition kernel and Particle Push kernel both set up the particle attributes for running any physics simulation with PIConGPU, so it is crucial to improve the performance of these two kernels. In this report, we measure single GPU metrics for the three kernels, offer high level takeaways from the conducted analysis, and compare the profiling data from NSight Compute to that of NVProf. This analysis was performed using a grid size of 240 x 272 x 224 and 10 time steps with the Mid-November Figure of Merit (FOM) run setup. The Traveling Wave Electron Acceleration (TWEAC) science case used in this run is a representative science case for PIConGPU. This execution can also be used for baseline analysis on AMD MI50/MI60 systems. As of the time of writing, the PIConGPU application has limited use for features of NSight Systems, so this report will mainly focus on insights garnered from NSight Compute.

Molecular simulation is an important tool for numerous efforts in physics, chemistry, and the biological sciences. Simulating molecular dynamics requires extremely rapid calculations to enable sufficient sampling of simulated temporal molecular processes. The Hewlett Packard Enterprise (HPE) Cray EX Frontier supercomputer installed at the Oak Ridge Leadership Computing Facility (OLCF) will provide an exascale resource for open science, and will feature graphics processing units (GPUs) from Advanced Micro Devices (AMD). The future LUMI supercomputer in Finland will be based on an HPE Cray EX platform as well. Here we test the ports of several widely used molecular dynamics packages that have each made substantial use of acceleration with NVIDIA GPUs, on Spock, the early Cray pre-Frontier testbed system at the OLCF, which employs AMD GPUs. These programs are used extensively in industry for pharmaceutical and materials research, as well as in academia, and are also frequently deployed on high-performance computing (HPC) systems, including national leadership HPC resources. We find that in general, performance is competitive and installation is straightforward, even at these early stages in a new GPU ecosystem. Our experiences point to an expanding arena for GPU vendors in HPC for molecular simulation.