Mar 5 – 7, 2024
Julius-Maximilians-Universität Würzburg
Europe/Berlin timezone

Combining file-based with RDMS-based scientific workflows using the LinkAhead-crawler

Mar 6, 2024, 2:10 PM
20m
HS5

Talk (15min + 5min) Research Software for Computing and Visualising Text

Speaker

Dr Alexander Schlemmer (Max Planck Institute for Dynamics and Self-Organization, Göttingen)

Description

While research data management systems (RDMSs) provide many benefits for scientists, data integration remains one of the major bottlenecks for their adoption. The omnipresent dependency on file-based digital workflows and the strong heterogeneity of file and data layouts pose particular challenges. We have developed a crawler-based concept [1] that combines file-based digital workflows with RDMS software so that both can be used simultaneously. Furthermore, the concept includes a flexible, YAML-based configuration of data integration procedures, which facilitates its application to different use cases. We demonstrate how to apply these concepts in practice using the LinkAhead-crawler framework (CaosDB was recently renamed to LinkAhead). The software is published as open-source software under the AGPLv3 license and is available online (https://gitlab.com/linkahead/linkahead-crawler).

[1] tom Wörden, H.; Spreckelsen, F.; Luther, S.; Parlitz, U.; Schlemmer, A. Mapping hierarchical file structures to semantic data models for efficient data integration into research data management systems. Preprints 2023, 2023081170. https://doi.org/10.20944/preprints202308.1170.v1
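
To make the crawler idea in the description more concrete, the following is a minimal sketch of how a declarative rule tree could map a hierarchical file layout to structured records. It is not the LinkAhead-crawler API or its YAML configuration format; the names `RULES`, `crawl`, and `make_demo_tree`, the folder layout, and the record fields are all hypothetical and chosen only for illustration. The rules are written as a Python dictionary here to keep the example dependency-free, where the actual framework uses a YAML-based format.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical rule tree (an assumption, not the LinkAhead-crawler format):
# each level matches one directory or file name with a regular expression,
# and the named groups become properties of the resulting record.
RULES = {
    "pattern": r"(?P<project>[A-Za-z]+)",
    "children": {
        "pattern": r"(?P<date>\d{4}-\d{2}-\d{2})_(?P<experiment>\w+)",
        "children": {
            "pattern": r"(?P<name>.+)\.csv",
            "record_type": "Measurement",
        },
    },
}


def crawl(path: Path, rule: dict, context: Optional[dict] = None) -> list:
    """Walk `path` according to `rule` and return a list of record dicts."""
    context = dict(context or {})
    match = re.fullmatch(rule["pattern"], path.name)
    if match is None:
        return []  # entries that match no rule are ignored
    context.update(match.groupdict())

    records = []
    if "record_type" in rule and path.is_file():
        # A leaf rule: emit one record carrying everything collected so far.
        records.append({"type": rule["record_type"], "file": path.name, **context})
    if "children" in rule and path.is_dir():
        for child in sorted(path.iterdir()):
            records.extend(crawl(child, rule["children"], context))
    return records


def make_demo_tree(base: Path) -> Path:
    """Create a small, made-up folder layout below `base` and return its root."""
    root = base / "DemoProject"
    run_dir = root / "2024-03-06_pilot"
    run_dir.mkdir(parents=True)
    (run_dir / "run1.csv").write_text("t,value\n0,1.0\n")
    (run_dir / "notes.txt").write_text("not covered by any rule\n")
    return root
```

In a real setup, the resulting records would then be inserted into or updated in the RDMS while the files stay where they are, which is what allows both ways of working to coexist; that synchronization step is not part of this sketch.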

Primary authors

Henrik tom Wörden (IndiScale GmbH)
Florian Spreckelsen (IndiScale GmbH)
Prof. Stefan Luther (Max Planck Institute for Dynamics and Self-Organization, Göttingen)
Prof. Ulrich Parlitz (Max Planck Institute for Dynamics and Self-Organization, Göttingen)
Dr Alexander Schlemmer (Max Planck Institute for Dynamics and Self-Organization, Göttingen)
