The number of scientific applications that use GPUs for accelerated computing is growing steadily, bringing new programming models and practices into play. Moreover, ML applications introduce parallelization strategies and communication libraries that are new to HPC environments. These factors require us to revise the tools and methodologies we use for performance analysis, while keeping our focus on scalability and our attention to detail.
In this webinar, we will show you how to gain additional performance insight from NVIDIA Nsight Systems (nsys) traces with Paraver by translating them with our tool nsys2prv. Visualizing these traces in one of POP's performance analysis tools lets us navigate across scales more easily, display new and interesting performance metrics, apply efficiency models, and compare and quantify differences between executions, objects, or timeline regions.