4–6 Nov 2024
virtual event
Europe/Berlin timezone

3. The Two Motors of the DataHub Initiative for Environmental Sciences: a Powerful FAIR and Open Research Data Infrastructure together with Joint Semantic Metadata Schemas.

Instructors: Marc Hanisch, Christof Lorenz, Ulrich Loup

Date: 6 Nov 2024
Time: 9:00-13:00 CET
Room: 5

Format: Tutorial with presentations and hands-on demonstrations

In environmental sciences, time-series data are crucial for monitoring environmental processes, validating earth system models and remote sensing products, training data-driven methods, and improving our understanding of climate processes. However, even today, there is no uniform standard or interface for making such data consistently available according to the FAIR principles. Therefore, within the DataHub initiative, seven research centers from the Helmholtz research field Earth & Environment initiated the HMC project STAMPLATE. The aim of STAMPLATE is to adopt the SensorThings API (STA) of the Open Geospatial Consortium as the main framework and interface through which such data are made accessible.

Since the project started in 2023, numerous side activities and initiatives have led to the establishment of a full digital ecosystem for time-series data, built around the STA. This ecosystem includes tools for managing sensor metadata, quality control of observational data, consolidation and visualization via an overarching (meta)data portal, and fully automated data pipelines that connect all these tools for simple, user-friendly publication of data according to the FAIR principles.

A challenging task for the STAMPLATE committee was to harmonize the extremely heterogeneous metadata formats stemming from the different observation domains, such as earth, atmosphere and ocean. Moreover, even within each domain, different metadata formats have developed historically due to diverging system architectures and missing guidelines.

Main content:

  • Presentation of the architecture of our ecosystem
  • Introduction to the STA as generic and modern interface for time-series data
  • Presentation of the work on metadata homogenization
  • Presentation and hands-on tutorials of integrated tools and sub-systems
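
To give a rough idea of what the STA looks like as an interface for time-series data, the following minimal sketch builds a request URL for the observations of a single datastream, using the standard STA query options ($filter, $orderby, $select, $top). The base URL, the datastream ID and the helper name build_observations_url are assumptions made for illustration only; they do not refer to an actual DataHub endpoint or tool.

```python
from urllib.parse import quote, urlencode


def build_observations_url(base_url, datastream_id, start, end, top=100):
    """Build an STA request URL for the observations of one datastream.

    base_url      -- service root including the version, e.g. ".../v1.1"
                     (hypothetical; use the address of your own STA server)
    datastream_id -- numeric ID of the datastream (string IDs would need
                     to be wrapped in single quotes)
    start, end    -- ISO 8601 timestamps limiting the phenomenon time
    top           -- maximum number of observations per response page
    """
    params = {
        # Keep only observations inside the half-open interval [start, end).
        "$filter": f"phenomenonTime ge {start} and phenomenonTime lt {end}",
        # Return the observations in chronological order.
        "$orderby": "phenomenonTime asc",
        # Transfer only the two properties needed for a simple time series.
        "$select": "phenomenonTime,result",
        "$top": str(top),
    }
    # Percent-encode spaces but keep "$", "," and ":" readable.
    query = urlencode(params, safe="$,:", quote_via=quote)
    root = base_url.rstrip("/")
    return f"{root}/Datastreams({datastream_id})/Observations?{query}"


if __name__ == "__main__":
    print(
        build_observations_url(
            "https://example.org/sta/v1.1",  # hypothetical server
            42,                              # hypothetical datastream ID
            "2024-01-01T00:00:00Z",
            "2024-02-01T00:00:00Z",
        )
    )
```

The resulting URL can be opened with any HTTP client. An STA server answers with a JSON document whose "value" array holds the selected observations.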

Agenda

  • Welcome, introduction and overview (10 min)
  • Introduction to the DataHub and its digital ecosystem (30-45 min)
  • Demonstration of a use case from different views: data ingest, metadata management, result in the portal (60 min)
  • STA and JSON schemas and how they impact this pipeline (30 min)
  • The special role of sensor metadata systems like the Sensor Management System (SMS) or the O2A-registry as well as Central Vocabularies (CVs) and curation workflows (30 min)
  • Discussion and outlook (45 min)

Goal

In this workshop, we give an overview of the DataHub digital ecosystem with its integrated services and features, present its current status, and offer hands-on tutorials for selected tools. Furthermore, we demonstrate the importance of reaching a consensus on metadata.

Target group

All researchers, technicians and data professionals whose work is related to time-series data, particularly data from any kind of sensor system.

Prerequisites 

None

Registration

Register here