11 June 2024
Forschungszentrum Jülich

Tutorial: HPC for Researchers & Accelerating massive data processing in Python with Heat

Vytautas Jančauskas1, Daniela Espinoza Molina1, Antony Zappacosta1, Roman Zitlau1, Fabian Hoppe2, Kai Krajsek3, Claudia Comito3

1 Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR), Remote Sensing Technology Institute, EO Data Science, Oberpfaffenhofen
2 Deutsches Zentrum für Luft- und Raumfahrt e.V., Institut für Softwaretechnologie, High-Performance Computing, Köln
3 Forschungszentrum Jülich GmbH, Institute for Advanced Simulation, Jülich Supercomputing Centre

 

When: 10:15 - 18:00
Where: JSC meeting room 2, building 16.3, room 315 [coordinates]
 

Part I: HPC for Researchers

An introduction to HPC for researchers using Python, with the HAICORE platform as the working example. The "Helmholtz AI computing resources" (HAICORE) provide easy, low-barrier GPU access to the entire AI community within the Helmholtz Association. In this tutorial you will:

  • Gain access to the platform, set up two-factor authentication (2FA), and log in.
  • Understand basic HPC concepts (distributed computing, etc.)
  • Set up your own software environment using conda.
  • Request and use GPU and CPU resources through SLURM.
  • Set up and use Dask to distribute your data science workflows.
  • Accelerate your software with Numba (a minimal sketch follows this list).
  • Write custom CUDA kernels in Python.

Latest details and requirements on github [link]
 

Part II: Accelerating massive data processing in Python with Heat

Many data processing workflows in science and technology build on Python libraries such as NumPy, SciPy and scikit-learn, which are easy to learn and easy to use. These libraries rest on highly optimized computational backends and therefore deliver competitive performance, at least as long as GPU acceleration is not required and the memory of a single workstation or cluster node suffices for the task at hand.
However, with steadily growing data sets, being limited to the RAM of a single machine can become a severe obstacle. At the same time, the step from a workstation to a (GPU) cluster can be challenging for domain experts without prior HPC experience.

Our Python library Heat ("Helmholtz Analytics Toolkit") targets exactly this group of users, and this tutorial gives a brief hands-on introduction to it. Heat builds on PyTorch and mpi4py and simplifies porting NumPy/SciPy-based code to GPUs (CUDA, ROCm), including multi-GPU, multi-node clusters. On the surface, Heat implements a NumPy-like API, is largely interoperable with the Python array ecosystem, and can be employed seamlessly as a backend to accelerate existing single-CPU pipelines as well as to develop new HPC applications from scratch. Under the hood, Heat distributes memory-intensive operations and algorithms via MPI communication and thus avoids some of the overhead often introduced by task-parallelism-based libraries for scaling NumPy/SciPy/scikit-learn applications.
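
To illustrate what this looks like in practice, here is a minimal sketch of Heat's NumPy-like interface. It assumes Heat is installed (e.g. via pip) and the script is launched with several MPI processes; the array sizes are illustrative.

    # Minimal Heat sketch; run e.g. with: mpirun -n 4 python heat_demo.py
    import heat as ht

    # The array is split along axis 0, so each MPI process holds one chunk.
    x = ht.random.randn(1_000_000, 8, split=0)

    # Operations look like NumPy; Heat works on the process-local chunks and
    # inserts MPI communication only where it is needed.
    col_means = x.mean(axis=0)
    centered = x - col_means

    print(centered.shape, centered.split)
    # With a CUDA- or ROCm-enabled PyTorch, passing device="gpu" at array
    # creation moves the computation onto the GPUs.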

In this tutorial you will get an overview of:

  • Heat's basics: getting started with distributed I/O, the data decomposition scheme, and array operations.
  • Existing functionality: multi-node linear algebra, statistics, signal processing, machine learning… (a short clustering sketch follows this list).
  • DIY how-to: using existing Heat infrastructure to build your own multi-node, multi-GPU research software.
  • Heat's implementation roadmap and possible paths to collaboration.
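
As an example of the machine-learning functionality mentioned above, the following sketch shows distributed clustering with Heat's scikit-learn-style interface. The data set, file name and parameters are illustrative assumptions, not tutorial material.

    # Sketch of multi-process clustering with Heat; run under MPI.
    import heat as ht

    # In practice you would load a distributed data set, e.g.:
    #   X = ht.load_hdf5("data.h5", dataset="features", split=0)
    # Here we simply generate random features, split across processes.
    X = ht.random.randn(100_000, 16, split=0)

    kmeans = ht.cluster.KMeans(n_clusters=4)
    kmeans.fit(X)

    labels = kmeans.predict(X)      # cluster assignment for every sample
    print(kmeans.cluster_centers_)  # centroids, computed across all processes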

Prerequisites

  • notebook (laptop) with an SSH client
  • access to HIFIS resources (we will use HAICORE@KIT), granted through the Helmholtz AAI
  • privacyIDEA app on smartphone (to set up 2FA)

Latest details on github [link]