Data processing with Pandas & Data plotting with Matplotlib

Europe/Berlin
online

Description

Organized in cooperation between Helmholtz Federated IT Services (HIFIS) and the Helmholtz Information & Data Science Academy (HIDA)

The following contents await you during this workshop:

These course days give a hands-on, fundamental introduction to the data processing framework Pandas and the data plotting framework Matplotlib. Both are Python frameworks and are very popular across all areas of data science thanks to their wide range of functionality and their usability.

All workshop days cover alternating sequences of theoretical input and hands-on exercises, during which the instructors are available for quick feedback and advice.

Prerequisites: Basic knowledge of the Python language (variables, functions, loops, conditions).
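
As a rough self-check for the expected level, the sketch below uses only the concepts named in the prerequisites (variables, functions, loops, conditions). The function name and the sample values are purely illustrative and are not taken from the course material.

```python
# Illustrative self-check; names and values are made up for this example.

def count_above(values, threshold):
    """Count how many entries in `values` are greater than `threshold`."""
    count = 0                      # variable
    for value in values:           # loop
        if value > threshold:      # condition
            count += 1
    return count


temperatures = [12.5, 18.0, 21.3, 9.8]
warm_days = count_above(temperatures, 15.0)
print(warm_days)
```

If you can follow what this snippet does without looking anything up, you should be well prepared for the workshop.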

Course Times

Oct 16, 2024, 10:00am - 04:00pm

Oct 17, 2024, 10:00am - 04:00pm

Oct 18, 2024, 10:00am - 02:00pm

 

NOTE: Registration opens on September 18, 2024, at 12:00 pm (noon).

Additional Information

The parts of the course build on each other, so we strongly recommend that you do not miss any of them. To receive a certificate, full-time and active participation is expected.

Your registration for this course is binding. If you need to leave/miss the course for a period of time, please let us know in advance.

There is no waiting list for this course! If someone withdraws from the course, their place is automatically reopened. If the course is full and you would like to attend, we therefore advise you to keep an eye on the registration page. This course will also be offered again in the future; you can check our HIDA Course Catalog for updates.

This course is free of charge. 

Oct 16, 2024

    • 10:00 10:30
      Welcome and Introduction (Pandas) 30m
    • 10:30 11:30
      Introduction to Pandas
      • 10:30
        The Series Datatype 30m

        Learn about the fundamental data type used for labelled sequential data.

      • 11:00
        Introduction to DataFrames 30m

        Get to know a more advanced data type to deal with multi-dimensional data.

    • 11:30 12:30
      Lunch 1h
    • 12:30 14:00
      Introduction to Pandas
      • 12:30
        Accessing Data 30m

        Learn about multiple ways to access subsections of a Series or a DataFrame as well as single elements within them.

      • 13:00
        Filtering Data 30m

        Learn how to create filter masks to separate out the interesting data.

      • 13:30
        Modifying Data 30m

        Learn how to add new data and manipulate existing data in a DataFrame or Series.

    • 14:00 14:30
      Coffee 30m
    • 14:30 16:00
      Introduction to Pandas
      • 14:30
        Hands-on Exercise (Pandas) Loading Data 1h 30m

        Try out your newly learned skills on a real-life data set.
        In this section, you will learn how to load a complex data set and re-shape it into a usable form based on the provided metadata. The instructors are available to give advice and feedback.

Oct 17, 2024

    • 10:00 10:30
      Welcome and Introduction (Matplotlib) 30m
    • 10:30 11:30
      Introduction to Matplotlib
      • 10:30
        Introduction to Pyplot 30m

        Pyplot is a collection of shortcuts for common tasks. Learn how to use it to create basic plots.

      • 11:00
        Multiple Plots 30m

        Learn how to create multiple plots and arrange them in a single figure.

    • 11:30 12:30
      Lunch 1h
    • 12:30 14:00
      Introduction to Matplotlib
      • 12:30
        Object-oriented Style 45m

        Learn about an alternative approach to setting up plots.

      • 13:15
        Matplotlib + Pandas 45m

        Learn how you can combine Pandas and Matplotlib for quick plotting.

    • 14:00 14:30
      Coffee 30m
    • 14:30 16:00
      Introduction to Pandas
      • 14:30
        Hands-on Exercise (Pandas) Cleaning Data 1h 30m

        The loaded data set still has quite a few inconsistencies, missing values and formatting peculiarities that need to be dealt with before it is ready for analysis.

Oct 18, 2024

    • 10:00 11:30
      Introduction to Pandas
      • 10:00
        Hands-on Exercise (Pandas) Analyzing Data 1h 30m

        Analyze the now-cleaned data set to extract new knowledge about the climatic conditions at the chosen location.

    • 11:30 12:30
      Lunch 1h
    • 12:30 14:00
      Introduction to Matplotlib
      • 12:30
        Hands-on Exercise (Matplotlib) 1h 30m

        Try out your newly learned skills on a set of real-life use cases. The instructors are available to give advice and feedback.
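
To give a first impression of the topics listed in the agenda above, here is a minimal sketch that touches on Series, DataFrames, accessing, filtering and modifying data, and on combining Pandas with Matplotlib in the object-oriented style. The sample values and column names are invented for illustration; the course works with its own data set, and the actual exercises may look quite different.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data (assumption); the course uses its own data set.
temperature = pd.Series(
    [3.1, 4.5, 8.2, 12.9],
    index=["Jan", "Feb", "Mar", "Apr"],
    name="temperature_c",
)

# A DataFrame holds several labelled columns that share one index.
weather = pd.DataFrame({
    "temperature_c": temperature,
    "rain_mm": [42.0, 35.5, 40.1, 30.2],
})

# Accessing data: by label with .loc, by position with .iloc.
march_temperature = weather.loc["Mar", "temperature_c"]
first_row = weather.iloc[0]

# Filtering data: a boolean mask selects the rows of interest.
mild_mask = weather["temperature_c"] > 5.0
mild_months = weather[mild_mask]

# Modifying data: add a derived column.
weather["temperature_f"] = weather["temperature_c"] * 9 / 5 + 32

# Object-oriented Matplotlib style with two plots in one figure.
fig, (ax_temp, ax_rain) = plt.subplots(1, 2, figsize=(8, 3))
ax_temp.plot(weather.index, weather["temperature_c"], marker="o")
ax_temp.set_title("Temperature")
ax_temp.set_ylabel("°C")

# Pandas can draw onto an existing Matplotlib axes via the `ax` argument.
weather["rain_mm"].plot(kind="bar", ax=ax_rain, title="Rainfall")
ax_rain.set_ylabel("mm")

fig.tight_layout()
```

In an interactive session you would typically drop the `matplotlib.use("Agg")` line and call `plt.show()` at the end, or write the figure to disk with `fig.savefig(...)`.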