Introduction to Statistics



Axel Schumacher, Ines Reinartz (KIT), Sikha Ray (KIT), Thorsten Auth (Forschungszentrum Jülich)

Dates: 29.09./30.09./06.10./07.10.

IHRS BioSoft, BIF-IGS, and HIDSS4Health, powered by Helmholtz Munich, offer a four-day introductory course in statistics for participants working at the interfaces of biology with chemistry, physics, neuroscience, engineering, and data science. The course combines an overview of basic statistical methods, such as descriptive statistics, hypothesis testing, linear regression, and ANOVA, with their application in Python. No previous knowledge of statistics is required. By the end of the course, you will be able to identify appropriate statistical methods, apply them in your data analysis, and interpret your results in a meaningful way.

Workshop Content

  • Descriptive statistics
    • Levels of variables
    • Measures of central tendency and variability (mean, median, variance, ...)
    • Classical statistical graphics and when to apply them (histograms, box plots, violin plots, ...)
  • Random variables
    • Distribution of random variables
    • Characteristics of distributions
    • Confidence intervals
  • Hypothesis testing
    • How to apply tests
    • Classical statistical tests (t-test, ANOVA, ...)
    • When to apply which test
    • Multiple testing and corrections
  • Linear regression
    • The idea of linear regression
    • How to apply and interpret linear models
    • Limits of linear regression

The focus in all chapters is on understanding when to apply which method, how to run it in Python, and how to interpret the output. Limitations and extensions of the methods are also discussed.
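To give a feel for the kind of analysis covered, here is a minimal sketch of three of the listed topics: descriptive statistics, a two-sample test statistic, and a simple linear fit. It is an illustration only, not course material. The data values and the function names `describe`, `welch_t`, and `fit_line` are hypothetical, and the sketch uses only the Python standard library so that it runs without extra packages (the course itself works with pandas, NumPy, Matplotlib, and Seaborn).

```python
import math
import statistics

# Hypothetical measurements for two groups (illustrative values only).
control = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3]
treated = [4.6, 4.9, 4.4, 5.1, 4.7, 4.8]


def describe(sample):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "variance": statistics.variance(sample),  # sample variance (n - 1)
        "stdev": statistics.stdev(sample),
    }


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard errors of the two sample means.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


if __name__ == "__main__":
    print("control:", describe(control))
    print("treated:", describe(treated))
    t, df = welch_t(treated, control)
    print(f"Welch t = {t:.2f} with about {df:.1f} degrees of freedom")
    slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
    print(f"fitted line: y = {intercept:.2f} + {slope:.2f} * x")
```

Turning the t statistic into a p-value requires the t distribution, which the standard library does not provide. In practice a statistics package would handle that step, and interpreting the result is part of what the course teaches.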


  • Each day is divided into blocks: first the statistical theory behind the methods and their application in Python, then hands-on examples with best-practice solutions.
  • The trainers are happy to answer questions during the talks and exercises.


  • Participants need basic Python programming skills, such as those taught in our course "Python from Zero to Data Science". Specifically, we require basic knowledge of the packages pandas, Matplotlib, Seaborn, and NumPy.
  • We will use Spyder as the integrated development environment (IDE). If you prefer another environment that supports Python, such as Thonny or Jupyter notebooks, you are welcome to use it instead.
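If you want to check your setup before the course, the following minimal sketch reports whether the four packages named above can be found in your Python environment. The helper name `check_packages` is hypothetical, and the import names (`matplotlib` and `seaborn` in lowercase) are the usual ones for these packages. The check only looks for installed packages and says nothing about versions.

```python
from importlib.util import find_spec

# Import names of the packages listed in the requirements above.
REQUIRED = ("pandas", "matplotlib", "seaborn", "numpy")


def check_packages(names=REQUIRED):
    """Map each package name to True if it can be found in this environment."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_packages().items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

Run it with the same interpreter your IDE uses, since Spyder, Thonny, and Jupyter can each point to a different Python installation.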

Please note that the timings below are for guidance and may change to meet the needs of the course participants. Please block the full days in your calendar and be prepared for the lunch break to shift.