6–23 May 2025
online
Europe/Berlin timezone

sponsored by Helmholtz Information & Data Science Academy (HIDA)

in cooperation with Core Facility Statistical Consulting at Helmholtz Zentrum München - German Research Center for Environmental Health (Helmholtz Munich)

Multivariate Statistics 1

Participants will learn when and how to apply unsupervised learning methods, such as PCA for dimension reduction and clustering techniques like k-means, hierarchical clustering, and hybrid approaches. The course also covers rotation techniques following dimension reduction, as well as mixture models, heatmaps, and more advanced clustering methods (DBSCAN, Louvain). The content is designed to provide a foundational understanding of the theory behind multivariate analysis. Each topic is accompanied by hands-on exercises using the statistical software R. Participants are encouraged to ask questions and seek advice on analyzing their own datasets.

Topics:

This course on multivariate statistics covers two different topics:

  • Dimension reduction methods: This first chapter focuses on principal component analysis (PCA): what is "under the hood", how many principal components to choose, and how to visualize and interpret the results. A brief overview of rotation techniques and other unsupervised multivariate methods (e.g., for categorical variables or data structured into groups) is also part of the lecture.
  • Cluster analysis: This second chapter describes the different measures of dissimilarity and distance that can be used to define clusters. It focuses on the two most frequently used clustering methods, k-means and hierarchical clustering, and on the combination of these two methods into hybrid algorithms. This chapter also covers the theory and application of mixture models, as well as the R commands for producing heatmaps together with the result of a clustering algorithm. Finally, two other clustering methods, DBSCAN and the Louvain method for community detection, are introduced at the end of this lecture.

Methods:

Each day consists of blocks that first cover the theory behind the methods and then their application in R. Theoretical lessons are followed by hands-on examples with best-practice solutions.

Learning Goals

1. Understand and Apply Principal Component Analysis (PCA)

  • Describe the principles of PCA, how to determine the number of components to retain, and how to interpret them.
  • Apply rotation techniques to make the components easier to interpret.
  • Use hands-on exercises to confidently apply PCA to real-world data using R.

2. Understand and Apply Clustering Methods

  • Choose appropriate dissimilarity measures for your data.
  • Explore different clustering techniques, such as k-means, hierarchical clustering, and some hybrid approaches.
  • Execute clustering methods in R using real-world data.

3. Explore Advanced Clustering Approaches

  • Understand mixture models, DBSCAN, and Louvain clustering to identify complex data structures and communities.
  • Create heatmaps to visualize clustering outcomes and enhance interpretability of multivariate analyses.

Prerequisites

Programming skills in R (e.g., from the course Introduction to R) and basic knowledge of statistics (e.g., from the course Introduction to Statistics).

Target Group

This course is open to researchers at all career stages and to anyone interested in learning about the subject.

Course Days & Times

May 6, 2025, 9 am - 5 pm

May 15, 2025, 9 am - 12:30 pm

May 16, 2025, 9 am - 12:30 pm

May 22, 2025, 9 am - 12:30 pm

May 23, 2025, 9 am - 12:30 pm

 

NOTE: Registration will open on April 8, 2025, at 12 pm.

Attendance & Certificates 

The course content is coordinated, so we strongly recommend that you do not miss any part of the course. To receive a certificate, full-time and active participation is expected.

Registration & Cancellation

This course is open to individuals affiliated with Helmholtz or a HIDA Partner only.

Your registration for this course is binding. If you need to miss part of the course, please let us know in advance via hida-courses@helmholtz.de.

If you have to cancel your participation for any reason, please do so as soon as possible so that others have time to take your seat. To cancel, please withdraw your registration on the course site or send an email to hida-courses@helmholtz.de.

Additional Information

There is no waiting list for this course! If someone withdraws from the course, their place is automatically reopened. If the course is fully booked and you would like to attend, we therefore advise you to keep an eye on the registration page. This course will also be offered again in the future; you can check our HIDA course catalog for updates.

This course is free of charge. 
