Data Processing with Pandas and Visualization with Matplotlib

Europe/Berlin
online

Fredo Erxleben (Helmholtz-Zentrum Dresden-Rossendorf), Zoe La France (HIDA)
Description

organized in cooperation with Helmholtz Information & Data Science Academy (HIDA) and Helmholtz Federated IT Services (HIFIS) 

In this workshop you will learn how to use the Python frameworks Pandas, for loading, filtering and evaluating tabular data, and Matplotlib, for generating paper-ready plots from this data. After an introduction to the basic concepts, we will explore the workflow from raw data to finished plot using a real-life dataset.
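As a rough illustration of the raw-data-to-plot workflow described above, a minimal sketch might look as follows. The dataset, the column names (`station`, `month`, `temperature_c`) and the output file name are invented for this example and are not part of the course material.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical raw data; a real project would load a file with pd.read_csv(<path>)
RAW_CSV = """station,month,temperature_c
A,1,2.5
A,2,4.0
A,3,8.5
B,1,-1.0
B,2,0.5
B,3,5.0
"""

# Load: parse the tabular data into a DataFrame
data = pd.read_csv(io.StringIO(RAW_CSV))

# Filter: keep only the rows with temperatures above freezing
warm = data[data["temperature_c"] > 0]

# Evaluate: mean temperature per month across all stations
means = data.groupby("month")["temperature_c"].mean()

# Plot: draw the monthly means and save the figure to a file
fig, ax = plt.subplots()
ax.plot(means.index, means.values, marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Mean temperature (°C)")
fig.savefig("monthly_mean.png")
plt.close(fig)
```

The sketch keeps loading, filtering, evaluating and plotting as separate steps so that each one can be inspected on its own.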

The course is taught in two alternating ways:

  • A live-coding lecture in which you will write the program along with your instructor while getting to know elements of the programming language and their use cases.

  • Hands-on exercises where you will solve posed tasks on your own, supported by the instructor for questions and feedback.

Learning Goals

By the end of the course, you will be familiar with the fundamental concepts of these frameworks and be able to use them to create analysis and visualization scripts for your own datasets.

Prerequisites 

To participate in this course, you need to have a good understanding of the fundamental concepts of the Python programming language and be familiar with the programming tool you are using. An understanding of the following concepts is beneficial and helps you focus on the core content:

  • Object-oriented Programming

  • Generator Expressions

  • Regular Expressions (Regex)

  • Assignment Expressions

Target Group

Learners of all academic fields who have to work with datasets: cleaning, investigating, modifying and plotting them.

Course Days & Times

Apr 7, 2025, 10 am - 4 pm

Apr 8, 2025, 10 am - 4 pm

Apr 9, 2025, 10 am - 2 pm

 

NOTE: Registration will open March 10, 2025, 12 pm. 

Attendance & Certificates 

The course sessions build on each other, so we strongly recommend that you do not miss any part of the course. To receive a certificate, you need at least 80% attendance and active participation.

Registration & Cancellation

This course is open only to individuals affiliated with Helmholtz or a HIDA Partner. When registering, assign yourself to one of the following groups:

  1. All Helmholtz affiliations
  2. Helmholtz Information & Data Science School (HIDSS) affiliation
  3. HIDA Partner affiliation

Please note that after the first two weeks of the registration period, the unbooked seats from categories 2 and 3 will be opened to all Helmholtz affiliations (category 1).

Your registration for this course is binding. If you need to leave/miss the course for a period of time, please let us know in advance via hida-courses@helmholtz.de.

If you have to cancel your participation for any reason, please do so as soon as possible to allow time for others to take your seat. To cancel, please withdraw your registration on the course site or write an email to hida-courses@helmholtz.de.

Additional Information

There is no waiting list for this course! If someone withdraws from a course, their place is automatically reopened. We therefore advise you to keep an eye on the registration page if the course is fully booked and you would like to attend. This course will also be offered again in the future; check the HIDA course catalog for updates.

This course is free of charge. 

    • 1
      Welcome & Organization
    • Lessons: Lesson I
      • 2
        Setting up a Python Project

        You will learn how a basic project is set up and explore two approaches to Python programming: using the REPL and writing Python files.

      • 3
        Variables, Assignments and Data Types

        Get to know the basic constructs for storing and manipulating information in a program. Understand what data types are and how they influence how information is processed.

      • 4
        Importing

        Since projects often get distributed over multiple files or require code from other sources, we will investigate how to import code from other files or libraries.

    • 11:30
      Lunch
    • Lessons: Lesson II
      • 5
        Conditionals

        It is often necessary to check conditions and act accordingly. This section covers how to express those conditions, how to control the order in which they are checked, and how to react to them.

      • 6
        While-Loops

        Loops are a good choice when it comes to repeating actions. In this section, the "while"-loop will be introduced as a method of repeating code based on a condition.

    • 14:00
      Coffee
    • Exercises: Exercise I
      • 7
        Exercise: Basics

        In this exercise session we will write our first programs of our own to solve small problems. The focus is on gaining experience with assignments, conditionals and loops and on fostering structure-oriented thinking.

    • 8
      Recap from Day 1
    • Lessons: Lesson III
      • 9
        Functions

        Splitting parts of programs off into self-contained, reusable blocks is a good way to handle complexity and allows parts of a program to be used in other projects as well.

      • 10
        For-Loops

        This section introduces the second kind of loop. The "for"-loop is well suited to iterating over a set of data or repeating a set of instructions a given number of times.

    • 11:30
      Lunch
    • Exercises: Exercise II
      • 11
        Exercise: Increased Complexity

        We will further practice using the basic structures to solve increasingly complex problems and plan approaches for tasks that are increasingly hard to solve by "just doing it".

    • 14:00
      Coffee
    • Exercises: Exercise III
      • 12
        Exercise: Functions

        In addition to the basic concepts we will now also use functions to better structure and sub-divide our programs, enabling us to solve increasingly complex tasks.

    • 13
      Recap from Day 2
    • Lessons: Lesson IV
      • 14
        Tuples

        Tuples are a great way to bundle up multiple values. Learn how to employ them and take advantage of Python's automatic packing/unpacking feature.

      • 15
        Lists

        Another very useful data type is the list, an ordered collection of data. In this section we introduce some basic functionality and learn where to find more detailed information on this data type and many others.

      • 16
        Finalizing the Project

        We will put some finishing touches on our example project to make it ready for a first release. Further, possible future learning paths will be outlined.

    • 11:30
      Lunch
    • Exercises: Exercise IV
      • 17
        Exercise: Larger Programs

        In this exercise part we will encounter increasingly complex tasks that also require the use of lists, tuples, other loop structures and imports. The required approaches need to become increasingly structured and call for splitting the program across multiple files.
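
The building blocks named in the agenda above (conditionals, loops, functions, tuples and lists) can be sketched together in a few lines. This is only an illustrative example; the function name `classify`, the threshold and the sample values are made up and do not come from the course material.

```python
def classify(value, threshold=10):
    """Return a label for a value, decided by a conditional."""
    if value > threshold:
        return "high"
    elif value == threshold:
        return "equal"
    return "low"


# A list of tuples; each tuple bundles a name with a value
measurements = [("a", 4), ("b", 10), ("c", 17)]

# For-loop with tuple unpacking: each tuple is split into name and value
labels = []
for name, value in measurements:
    labels.append((name, classify(value)))

# While-loop: repeat as long as the condition holds
remaining = 17
steps = 0
while remaining > 0:
    remaining = remaining // 2  # integer division halves the value
    steps += 1

print(labels)
print(steps)
```

Here the function keeps the decision logic in one reusable place, while the loops only deal with walking over the data.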